FOREWORD


This Final Report presents the results of an eight-month
project on the design and application of an inferential
processor. The work on this project, conducted under Grant AFOSR
81-0115, commenced on 1 April 1981 and was completed on 30 November
1981.

The  research  was  carried  out  in  the  Department  of  Electrical 
Engineering  at  the  University  of  Kentucky.  Those  principally 
involved  were  F.  M.  Brown  (principal  investigator),  D.  K.  Taylor, 
and M. R. Rowlette; the latter two are graduate students who were
supported  by  funds  provided  by  the  University  of  Kentucky. 
ABSTRACT


Results  are  presented  concerning  the  design  and  application 
of  an  inferential  processor,  a  digital  machine  organized  to 
process  logical  data  at  high  rates  of  speed.  When  coupled  to  a 
general-purpose  digital  computer,  the  inferential  processor  would 
enable  reasoning  tasks  to  be  carried  out  rapidly  and  with  little 
programming  effort.  Specific  research-efforts  discussed  in  this 
report  are  (a)  mechanized  inference  in  Boolean  systems,  (b) 
functional  deduction,  and  (c)  inferential  analysis  of  relational 
databases. 
I. INTRODUCTION


The objective of the research described in this report has
been to investigate the design and application of an inferential
processor, a machine specialized for rapid processing of Boolean
(i.e., propositional) data. This research is part of a longer-term
effort to mechanize a new approach to reasoning in
propositional logic. The basic ideas underlying this approach
have been worked out over a period of several years; the practical
implementation of those ideas was first undertaken in 1980,
however, while the principal investigator was at the Air Force
Avionics Laboratory under the sponsorship of the USAF/SCEEE Summer
Faculty Research Program.

The  proposed  inferential  processor,  which  is  intended  to 
augment  the  computational  power  of  a  general-purpose  computer,  is 
to be a high-speed reasoning system having very general capability
within the domain of propositional logic. It may be implemented
either by microprogramming a general-purpose computer or
by  attaching  to  such  a  computer  a  special-purpose  processor;  the 
latter  implementation  [1]  is  assumed  in  this  report. 

Our  research  during  the  grant-period  has  been  organized  into 
the  following  tasks: 

1.  Mechanized  Inference  in  Boolean  Systems  (F.M.  Brown); 

2.  Functional  Deduction  (F.M.  Brown); 

3.  Boolean  Analysis  of  Relational  Databases  (D.K.  Taylor); 

4. Simulation of the Inferential Processor (M.R. Rowlette).

The foregoing research tasks were undertaken as eight-month
efforts promising the greatest progress toward the objectives
stated in our proposal. We present in this report the results of
the first three tasks; the results of the final task are to
appear in an M.S. thesis which is currently underway.

The  objective  of  the  first  task,  Mechanized  Inference  in 
Boolean  Systems,  was  to  develop  an  organized  and  coherent  theory 
of  Boolean  analysis.  The  basic  inferential  operations  on  systems 
of  Boolean  equations  were  studied,  terminology  was  established, 
and  properties  fundamental  to  the  operation  of  the  inferential 
processor were proved. The objects of the first task were principally
those of clarification, terminology, and proof. In the second
task, Functional Deduction, our object was to investigate a
new application of the processor, one which had only been sketched
in our previous research [2]. We believe functional deduction
to be a fundamental operation in Boolean analysis; it is the
inverse, essentially, of the much-studied problem of solving
Boolean equations. The results obtained under this task enable
functional deduction to be performed rapidly and efficiently by
the inferential processor. To study its essential features and
illustrate its practical utility, we have applied functional
deduction to the design of economical multiple-output combinational
circuits.

The objective of the third task, Boolean Analysis of Relational
Databases, was to investigate potential applications of
the inferential processor to database processing. We began by
studying the problems associated with relational databases. This
study showed that the generation of keys for a database is a
difficult problem of practical importance. The keys may be determined
if the functional dependencies associated with the database
are  known;  we  therefore  devised  an  algorithm  (the  first  to  our 
knowledge)  for  generating  the  functional  dependencies  in  a  given 
relational  database.  This  algorithm  also  produces  the  full  set  of 
minimal  keys  for  the  database.  The  algorithm  was  programmed 
entirely in the logical language PROLOG [3,4], which was used for
two reasons: first, this language is most effective for programming
tasks involving logic; second, PROLOG is an "inferential
processor" in software, whose operation we wished to study.

This  report  is  organized  in  two  parts.  Part  A  includes 
general  background  on  logical  computers  and  some  discussion  of 
the motivation for our research (Section II), a brief description
of the structure of the inferential processor (Section III), and
discussions of the results obtained under tasks 1 and 2 above
(Sections IV and V). Part B, originally prepared as an M.S.
thesis  [5],  presents  the  results  obtained  under  task  3. 




References


1. Brown, F. M., "Inferential Processor," Final Report,
AFOSR/SCEEE Summer Faculty Research Program, August 1980.

2. Brown, F. M., "High-Speed Reasoning in Propositional
Logic," Proposal to AF Office of Scientific Research,
July 1981.

3. Roussel, P., "PROLOG: manuel de reference et d'utilisation,"
Groupe d'Intelligence Artificielle, Universite d'Aix-Marseille,
Luminy, France, September 1975.

4. Clocksin, W.F. and C.S. Mellish, Programming in Prolog,
N.Y.: Springer-Verlag, 1981.

5. Taylor, D.K., "Analyzing Relational Databases Using Propositional
Logic," M.S. Thesis, Department of Electrical
Engineering, University of Kentucky, December 1981.




II.  LOGICAL  COMPUTERS 


We  outline  in  this  section  the  motivation  for  our  research, 
whose  ultimate  object  is  to  produce  a  logical  computer,  i.e.,  a 
machine capable of high-speed inferential processing in propositional
(Boolean) logic. Some of the material in the present
section is taken from a proposal [1] prepared during the grant
period; it is included in this report for completeness.

Propositional logic may be identified roughly with two-valued
Boolean algebra. This form of logic has applications in
many  areas,  a  few  of  which  are  logical  design  [2],  the  diagnosis 
of  failures  in  digital  systems  [3,4],  and  the  design  of 
relational  databases  [5].  It  is  the  basis,  moreover,  for 
reasoning  in  higher-order  logics  such  as  the  first-order 
predicate  calculus;  the  latter  is  required  for  applications  in 
artificial intelligence [6,7]. The propositional calculus is
related  to  the  higher-order  logics  in  somewhat  the  same  way  that 
arithmetic  is  related  to  the  various  fields  of  mathematical 
analysis;  it  is  a  structure,  useful  in  itself,  on  which  more 
elaborate  structures  are  built. 

The  range  of  application  of  the  propositional  calculus  was 
outlined  by  Ledley  [8]  as  follows:  "The  propositional  calculus 
can  be  applied  to  many  phases  of  military  science  and  related 
problems  as  well  as  to  business,  industry,  science,  and 
government  in  general.  In  these  applications  it  serves  as  an  aid 
to  complex  reasoning,  e.g.,  in  the  analysis  and  evaluation  of 
intelligence reports, the preparation and analysis of tactical
methods and principles, the formulation and interpretation of
legal  statutes,  the  planning  and  evaluation  of  chemical  and 
biological  experiments,  the  formulation  of  psychological  and 
intelligence  examinations,  and  the  formulation  and  evaluation  of 
business  methods  and  procedures.  All  of  these  and  similar 
'reasoning'  activities  and  operations  can  use  the  propositional 
calculus  in  a  fundamental  way.  More  well-known  are  its 
applications  to  the  design  of  industrial  process-control 
machines,  digital  computers,  large-scale  switching  circuitry,  and 
other forms of information-handling systems. However, the computational
methods of the propositional calculus present serious
and frequently insurmountable difficulties in the solution of actual
problems, and this factor has severely limited its practical
utilization. Consequently the need arises for a systematic way of
formulating, analyzing and solving propositional functions and
equations."

Notwithstanding the "logical" nature of its internal operations,
a general-purpose computer is ill-suited to logical computation.
For this reason, a number of dedicated logical processors
have been proposed. For a detailed study of logic machines,
from the Ars Magna of Ramon Lull in the thirteenth century to the
relay machines of the 1950's, see Gardner [9]. The electronic machines
relevant to the present project may be put into three
classes: argument-verifiers, equation-solvers, and formula-minimizers.

All of the argument-verifiers [10-16] known to this investigator
(and none of the other kinds of logical computers) have
been  designed  by  logicians.  The  function  of  any  such  machine  is 
to  decide  the  validity  of  an  argument,  i.e.,  a  collection  of 
premises  together  with  a  conclusion.  Equation-solving  machines 
[17-22],  on  the  other  hand,  have  been  inspired  principally  by  the 
need  to  solve  technological  problems.  Such  machines  accept  some 
representation  of  a  system  of  Boolean  equations  and  produce  a 
solution  (typically  particular  rather  than  general)  for  a 
selected subset of the arguments in terms of the remaining arguments.


Formula-minimizing  machines  [23-27]  have  the  common  aim  of 
determining  simplified  sum-of-products  (disjunctive  normal  form) 
expressions for propositional functions. The procedures implemented
in all of these machines are based on Quine's formulation
[28-30] of the minimization problem, the essential feature of
which is the generation of the prime implicants of the given
propositional function. Formula-minimization may at first glance
appear to have little bearing on mechanized inference. It is significant,
however, for two reasons. First, formula-minimization
is  useful  in  the  application  of  other  reasoning  processes, 
improving  the  economy  and  perspicuity  of  the  results  obtained. 
Second,  the  existing  designs  for  formula-minimizing  machines 
represent  solutions  of  a  problem  that  is  dominant  in  the  design 
of  the  proposed  inferential  processor,  namely,  that  of  generating 
and storing the prime implicants of a propositional function.

Each of the machines cited above carries out a species of
reasoning; each extracts useful information, that is, from a collection
of propositional data. None of these machines has emerged
from the laboratory of its birth, however, because none is in any
sense "general-purpose" within the domain of propositional logic.
The element absent in these machines is a central principle of
reasoning, readily adaptable to argument-verification, equation-solving,
formula-minimization, and any other task involving logical
inference. The proposed inferential processor embodies such a
principle, viz., that the prime consequences characterize, in a
simple and economical way, all conclusions deducible from a
collection of propositional data.

The  technique  of  automated  inference  we  are  investigating  is 
based on a formulation given by A. Blake in a little-known dissertation
[31] published in 1937. The concept of a prime implicant,
customarily attributed to a paper published by Quine [28]
fourteen years later, as well as all of the presently-known methods
for generating prime implicants, were presented in Blake's
dissertation.  The  application  of  prime  implicants  to  formula- 
minimization  was  pointed  out  by  Quine  and  has  since  been 
intensively  studied  and  applied;  Blake's  application  of  prime 
implicants  to  logical  deduction,  however,  has  apparently  remained 
unnoticed. Blake's principal contribution was to show that a
single rule of inference, that of Hypothetical Syllogism (if P
implies Q and Q implies R, then P implies R), suffices to produce
all of the prime consequences of a collection of propositional
data. Expressed in terms of Boolean algebra, the single operation
of consensus (which Blake called the "syllogistic result") suffices
to produce all of the prime implicants of a Boolean function.
This idea is closely related to the "resolution principle"
given by Robinson [32] in 1965 and now applied in mechanical
theorem-proving [6] and in programming languages, such as PROLOG
[33,34], designed to solve problems in the predicate calculus.
When compared with Blake's use of consensus, the resolution principle
is formulated in a more general structure (the first-order
predicate calculus) and is applied to a less general problem
(theorem-proving by refutation).

Blake demonstrated the fundamental role of the prime consequences
in generating and verifying conclusions. Our research has
shown an additional advantage of the prime consequences, namely,
that they enable the fundamental operations of propositional inference
(e.g., elimination of variables, solution of equations in
general and particular form, generation of functional antecedents
and consequents) to be conveniently mechanized in a high-speed
processor.  Thus  a  machine  that  accepts  propositional  data  and 
produces  their  prime  consequences  can  be  made  general-purpose  in 
the  domain  of  propositional  logic. 
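
The role of consensus described above can be illustrated with a minimal sketch, assuming terms are stored as sets of literals. The function names (`consensus`, `absorb`, `blake_canonical_form`, `term`) and the `~` suffix marking a complemented literal are illustrative choices, not part of the processor's design. The sketch forms consensus terms and removes absorbed terms until nothing new appears. Applied to the premises of Hypothetical Syllogism, written as PQ' = 0 and QR' = 0, it yields the additional term PR', i.e., "P implies R."

```python
from itertools import combinations

def term(text):
    """Parse e.g. 'P Q~' into a set of (variable, polarity) literals."""
    return frozenset((tok.rstrip("~"), not tok.endswith("~")) for tok in text.split())

def consensus(p, q):
    """Consensus of two terms, or None unless they oppose in exactly one variable."""
    opposed = [v for (v, s) in p if (v, not s) in q]
    if len(opposed) != 1:
        return None
    x = opposed[0]
    return frozenset(lit for lit in p | q if lit[0] != x)

def absorb(terms):
    """Remove every term that strictly contains (is absorbed by) another term."""
    return {t for t in terms if not any(u < t for u in terms)}

def blake_canonical_form(terms):
    """Iterate consensus and absorption until no new term can be added."""
    terms = absorb(set(terms))
    while True:
        new = set()
        for p, q in combinations(terms, 2):
            c = consensus(p, q)
            # keep the consensus only if no existing term already absorbs it
            if c is not None and not any(t <= c for t in terms):
                new.add(c)
        if not new:
            return terms
        terms = absorb(terms | new)

# Premises "P implies Q" and "Q implies R" as terms of f in f = 0.
premises = [term("P Q~"), term("Q R~")]
bcf = blake_canonical_form(premises)
```

Here `bcf` contains the two premises together with the term for PR', which is the prime consequence stating that P implies R.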

III. ORGANIZATION OF THE INFERENTIAL PROCESSOR


We present in this section a brief outline of the organization
of the proposed inferential processor; see [1] for a more
complete description.

The  function  of  the  inferential  processor  is  to  accept, 
store,  and  process  Boolean  or  propositional  data.  It  is  intended 
to function as a high-speed adjunct to a general-purpose computer,
as indicated in Figure 1.


Fig.  1.  Total  system. 

The applications anticipated for the inferential processor
fall into two main classes: (a) tasks involving only propositional
(Boolean) logic and (b) tasks involving higher-order logic,
primarily the first-order predicate calculus. The first class
includes such applications as computer-aided design of logic
circuits [2], the design and analysis of databases (see [3],
which is included as Part B of this report), and on-line diagnosis
of faults in digital systems [4]. In the second class of
applications, unlike the first, the inferential processor does
not do all of the logical work; instead, it provides high-speed
subroutines for use by the general-purpose computer. The employment
of the inferential processor in the latter class of applications
is based on the fact that higher-order logics employ propositional
logic as their basic "arithmetic." Most applications of
this class come under the heading of artificial intelligence,
many branches of which depend heavily on the first-order predicate
calculus.

Principal Components

The  major  components  of  the  inferential  processor  are  shown 
in  Figure  2. 


Fig.  2.  Major  components  of  the  inferential  processor. 


The unit labelled TERM is a register that holds the term
(Boolean product) currently under consideration. The Minterm Processor
accepts terms from TERM, building from them a Boolean
function F(X1,...,Xn) using AND, OR, NOT, EOR, etc. The function
F is represented in the Minterm Processor by its minterm canonical
form. The Term Processor accepts the minterms of F from the
Minterm Processor and generates the Blake canonical form, i.e., the
disjunction of the prime implicants, of F. The Term Processor
carries out the fundamental operations of logical analysis (elimination
of variables, solution of equations, etc.) which,
arranged in programmed sequences, carry out the processing requested
by the general-purpose computer.

Major Phases of Operation

The operation of the inferential processor takes place in
three major phases: reduction, development, and analysis. The
reduction phase, carried out in the Minterm Processor, reduces a
system of logical equations to a single equation having the form
F = 0. The development phase, carried out in the Term Processor,
generates a representation of F in Blake canonical form. The analysis
phase, carried out in both processors, executes the sequence
of inferential operations requested by the general-purpose
computer.

IV. MECHANIZED INFERENCE IN BOOLEAN SYSTEMS

The task of the proposed inferential processor is to accept logical
data in the form of a Boolean system and to generate useful inferences
(conclusions) from such a system. We have attempted in this project to develop a
systematic formulation of (a) the properties of Boolean systems and (b) the
principal operations on such systems that are of use in logical inference. We
discuss that formulation in this section.

Review of Elementary Properties

The equivalences

$$a \le b \iff a\bar{b} = 0 \tag{1}$$

$$a = b \iff a \oplus b = 0 \tag{2}$$

$$[a = 0 \text{ and } b = 0] \iff a + b = 0 \tag{3}$$

$$[a = 1 \text{ and } b = 1] \iff ab = 1 \tag{4}$$

are valid for arbitrary elements a and b in a Boolean algebra. Equivalences
(3) and (4) have obvious extensions to three or more variables; thus
[a = 0 and b = 0 and c = 0] is equivalent to [a + b + c = 0], etc.

Boolean Systems

An n-variable Boolean system on a Boolean algebra B is a collection

$$\begin{aligned}
g_1(x) &= h_1(x) \\
&\;\;\vdots \\
g_k(x) &= h_k(x) \\
g_{k+1}(x) &\le h_{k+1}(x) \\
&\;\;\vdots \\
g_m(x) &\le h_m(x)
\end{aligned} \tag{5}$$

of simultaneously-asserted equations and inclusions in which the g's and h's
are n-variable Boolean functions on B and x denotes the vector $(x_1,\ldots,x_n)$.
The number, k, of equations may be zero in a system, as may the number, m - k, of
inclusions; we require, of course, that there be at least one equation or one
inclusion in a Boolean system.

Solutions. An element $b$ of $B^n$ is a solution of the system (5) if each
of the statements in (5) becomes an identity under the substitution $x = b$. A
Boolean system is said to be consistent if it has at least one solution; otherwise,
it is said to be inconsistent.

Implication and equivalence. Let $S_1$ and $S_2$ be two n-variable Boolean
systems on B. We say that $S_1$ implies $S_2$, written $S_1 \Rightarrow S_2$, in case the statement

$$(\forall b \in B^n)\;[\,b \text{ is a solution of } S_1 \Rightarrow b \text{ is a solution of } S_2\,]$$

is true. Note that $S_1$ implies any n-variable Boolean system if $S_1$ is inconsistent.
We say that two Boolean systems $S_1$ and $S_2$ are equivalent, written
$S_1 \Leftrightarrow S_2$, if each implies the other, i.e., if each has the same set of solutions.
Any two inconsistent systems, in particular, are equivalent.
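
The definitions of solution, consistency, implication, and equivalence can be checked by exhaustive enumeration. The following is a minimal sketch, assuming the two-element algebra B = {0,1} and representing each statement of a system as a Python predicate; the helper names and the sample systems `s1` to `s4` are hypothetical and serve only as illustration.

```python
from itertools import product

def solutions(system, n):
    """All points b in {0,1}^n at which every statement of the system holds."""
    return {b for b in product((0, 1), repeat=n)
            if all(statement(*b) for statement in system)}

def is_consistent(system, n):
    """A system is consistent if it has at least one solution."""
    return len(solutions(system, n)) > 0

def implies(s1, s2, n):
    """S1 implies S2 when every solution of S1 is a solution of S2."""
    return solutions(s1, n) <= solutions(s2, n)

def equivalent(s1, s2, n):
    """Two systems are equivalent when they have the same solution set."""
    return solutions(s1, n) == solutions(s2, n)

# Hypothetical two-variable systems.
s1 = [lambda x, y: (x & y) == 1]                          # xy = 1
s2 = [lambda x, y: x <= y]                                # x <= y (an inclusion)
s3 = [lambda x, y: x == 1, lambda x, y: x == 0]           # inconsistent
s4 = [lambda x, y: (x | y) == 0, lambda x, y: y == 1]     # inconsistent
```

With these samples, `implies(s1, s2, 2)` holds while the converse does not, and the two inconsistent systems `s3` and `s4` are equivalent, as the text notes.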


Reduction

By (1) and (2), the system (5) is equivalent to the system

$$\begin{aligned}
g_1(x) \oplus h_1(x) &= 0 \\
&\;\;\vdots \\
g_k(x) \oplus h_k(x) &= 0 \\
g_{k+1}(x)\,\bar{h}_{k+1}(x) &= 0 \\
&\;\;\vdots \\
g_m(x)\,\bar{h}_m(x) &= 0.
\end{aligned} \tag{6}$$

System (6) is equivalent, by (3), to the single equation

$$f(x) = 0, \tag{7}$$

where f is a Boolean function defined by

$$f = \sum_{i=1}^{k} (g_i \oplus h_i) + \sum_{i=k+1}^{m} g_i \bar{h}_i. \tag{8}$$

By similar reasoning, invoking (4) instead of (3), we deduce that the
system (5) is equivalent to the single equation

$$F(x) = 1, \tag{9}$$

where F is a Boolean function defined by

$$F = \prod_{i=1}^{k} \overline{(g_i \oplus h_i)} \cdot \prod_{i=k+1}^{m} (\bar{g}_i + h_i). \tag{10}$$

Any Boolean system can therefore be "boiled down" to a single equation
of the form (7) or of the form (9). We will focus principally on the form (7).

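Example 1. The system

$$\begin{aligned}
ax &= b + y \\
ab &\le ax + y
\end{aligned}$$

is equivalent to the system

$$\begin{aligned}
a\bar{b}x\bar{y} + \bar{a}b + \bar{a}y + b\bar{x} + \bar{x}y &= 0 \\
ab(\bar{a}\bar{y} + \bar{x}\bar{y}) &= 0,
\end{aligned}$$

which is equivalent, in turn, to the single equation

$$a\bar{b}x\bar{y} + \bar{a}b + \bar{a}y + b\bar{x} + \bar{x}y + ab\bar{x}\bar{y} = 0.$$

Example 2. The behavior of an AND-gate with inputs U and V and output W
is described by the three equivalent statements

$$UV = W \tag{11}$$

$$\bar{U}W + \bar{V}W + UV\bar{W} = 0 \tag{12}$$

$$\bar{U}\bar{W} + \bar{V}\bar{W} + UVW = 1. \tag{13}$$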

Boolean  Relations 

Given a Boolean algebra $B$ and a vector $x = (x_1,\ldots,x_n)$, a relation
(or constraint) on $x$ is a statement that confines $x$ to lie within a subset of
$B^n$. The operation of the AND-gate of Example 2, for instance, is specified by
the relation

$$(U,V,W) \in \{(0,0,0),(0,1,0),(1,0,0),(1,1,1)\}, \tag{14}$$

where $B = \{0,1\}$ and $x = (U,V,W)$. (Strictly speaking, the relation is the subset
$\{(0,0,0),(0,1,0),(1,0,0),(1,1,1)\}$ itself; it is convenient for our present
purposes, however, to call the statement (14) a relation.)

Two relation-statements on $x = (x_1,\ldots,x_n)$ will be called equivalent
if they confine $x$ to the same subset of $B^n$. Thus statement (14) above is equivalent
to equation (11), as well as to equations (12) and (13).
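
The equivalence of statements (11) to (14) for the AND-gate can be confirmed by enumerating {0,1}^3. The sketch below is a minimal check under the assumption that the complemented literals in (12) and (13) are those forced by W = UV, namely U'W + V'W + UVW' = 0 and U'W' + V'W' + UVW = 1; the function names are illustrative.

```python
from itertools import product

def stmt_11(u, v, w):
    """UV = W"""
    return (u & v) == w

def stmt_12(u, v, w):
    """U'W + V'W + UVW' = 0 (complements assumed as described above)."""
    return (((1 - u) & w) | ((1 - v) & w) | (u & v & (1 - w))) == 0

def stmt_13(u, v, w):
    """U'W' + V'W' + UVW = 1 (complements assumed as described above)."""
    return (((1 - u) & (1 - w)) | ((1 - v) & (1 - w)) | (u & v & w)) == 1

def subset_defined_by(statement):
    """The subset of {0,1}^3 to which the statement confines (U, V, W)."""
    return {p for p in product((0, 1), repeat=3) if statement(*p)}

relation_14 = {(0, 0, 0), (0, 1, 0), (1, 0, 0), (1, 1, 1)}
subsets = [subset_defined_by(s) for s in (stmt_11, stmt_12, stmt_13)]
```

Each entry of `subsets` coincides with `relation_14`, which is what it means for the four statements to be equivalent relations.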

An identity on $x = (x_1,\ldots,x_n)$ is a relation equivalent to the statement

$$x \in B^n.$$

An identity, in other words, is a relation on $x$ that doesn't really "confine"
$x$ at all. The relations $x_1 = x_1$ and $x_1 + x_2 = x_1 + \bar{x}_1 x_2$, for example, are
both identities on $(x_1,x_2)$.

A relation on $x = (x_1,\ldots,x_n)$ will be called a Boolean relation if it
is equivalent to a Boolean equation, i.e., if it is equivalent to a statement
of the form

$$f(x) = 0,$$

where $f: B^n \to B$ is a Boolean function. If $B = \{0,1\}$, then every relation on
$x$ is a Boolean relation. If $B$ is a Boolean algebra larger than $\{0,1\}$, then not
all relations are Boolean. Suppose $B = \{0,1,a,\bar{a}\}$. Then the relation

$$(x_1,x_2) \in \{(0,0),(a,0)\} \tag{15}$$

is a Boolean relation because it is equivalent to the Boolean equation

$$\bar{a}x_1 + x_2 = 0. \tag{16}$$

The set $\{(0,0),(a,0)\}$ of solutions of (16), that is, is precisely the set
defining the relation (15). The relation

$$(x_1,x_2) \in \{(0,0),(a,1)\}, \tag{17}$$

on the other hand, is not a Boolean relation. It is not equivalent, that is,
to a Boolean equation; any Boolean equation $f(x_1,x_2) = 0$ on $B = \{0,1,a,\bar{a}\}$ having
solutions $(0,0)$ and $(a,1)$ must also have solutions $(0,\bar{a})$ and $(a,a)$, a fact we shall
be able to show after we discuss the solution of Boolean equations.
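
The claims about relations (15) and (17) can be checked exhaustively. The sketch below assumes the four-element algebra is encoded as two-bit integers (0 = 00, a' = 01, a = 10, 1 = 11) with bitwise AND, OR, and complement, and it uses the standard fact that every two-variable Boolean function on B is fixed by its four minterm coefficients, so there are only 4^4 = 256 such functions. It also assumes equation (16) reads a'x1 + x2 = 0. All names are illustrative.

```python
from itertools import product

ZERO, A_BAR, A, ONE = 0b00, 0b01, 0b10, 0b11   # assumed encoding of B
B = (ZERO, A_BAR, A, ONE)

def comp(x):
    """Complement in B."""
    return x ^ ONE

def make_function(c00, c01, c10, c11):
    """Boolean function with minterm coefficients c_ij = f(i, j)."""
    def f(x1, x2):
        return ((c00 & comp(x1) & comp(x2)) | (c01 & comp(x1) & x2)
                | (c10 & x1 & comp(x2)) | (c11 & x1 & x2))
    return f

def solution_set(f):
    """All (x1, x2) in B^2 with f(x1, x2) = 0."""
    return frozenset(p for p in product(B, repeat=2) if f(*p) == ZERO)

# Solution sets of every Boolean equation f(x1, x2) = 0 on B.
ALL_SOLUTION_SETS = {solution_set(make_function(*c)) for c in product(B, repeat=4)}

rel_15 = frozenset({(ZERO, ZERO), (A, ZERO)})
rel_17 = frozenset({(ZERO, ZERO), (A, ONE)})

def eq_16(x1, x2):
    """Left side of (16), assumed to be a'x1 + x2."""
    return (A_BAR & x1) | x2

rel_15_is_boolean = rel_15 in ALL_SOLUTION_SETS
rel_17_is_boolean = rel_17 in ALL_SOLUTION_SETS
```

Under these assumptions `rel_15_is_boolean` is true and `rel_17_is_boolean` is false, and every solution set that contains (0,0) and (a,1) also contains (0,a') and (a,a).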

Eliminants

Let $f: B^n \to B$ be a Boolean function expressed in terms of arguments
$x_1,\ldots,x_n$. We derive from $f$ a set $\{C_T f \mid T \subseteq \{x_1,\ldots,x_n\}\}$ of Boolean
functions by applying the following rules:

(i) $C_{\emptyset} f = f$

(ii) $C_{\{x_1\}} f = f(0,x_2,\ldots,x_n) \cdot f(1,x_2,\ldots,x_n)$

(iii) $C_{R \cup S} f = C_R(C_S f)$.

We derive another set, $\{D_T f \mid T \subseteq \{x_1,\ldots,x_n\}\}$, of Boolean functions by
applying the rules

(i) $D_{\emptyset} f = f$

(ii) $D_{\{x_1\}} f = f(0,x_2,\ldots,x_n) + f(1,x_2,\ldots,x_n)$

(iii) $D_{R \cup S} f = D_R(D_S f)$.

We call $C_T f$ the conjunctive eliminant, and $D_T f$ the disjunctive eliminant,
of $f$ with respect to the subset $T$ of $\{x_1,\ldots,x_n\}$. Note that if $x$ is a
single letter, then the conjunctive and disjunctive eliminants of $f$ with respect
to $x$ are related to the discriminants $f_{\bar{x}}$ and $f_x$ (discussed in Chapter a)
as follows:

$$C_{\{x\}} f = f_{\bar{x}} \cdot f_x, \qquad D_{\{x\}} f = f_{\bar{x}} + f_x. \tag{18}$$

It is convenient to omit set-braces in specifying eliminants; thus we
write $C_{x_1 x_2} f$ rather than $C_{\{x_1,x_2\}} f$.

Suppose the subset $T$ comprises $k$ elements ($k \le n$) of $\{x_1,\ldots,x_n\}$ (we
assume without loss of generality that $T$ comprises the first $k$ elements, i.e.,
that $T = \{x_1,\ldots,x_k\}$). Then $C_T f$ and $D_T f$ are determined as follows:

$$C_T f = \prod_{b \in \{0,1\}^k} f(b, x_{k+1},\ldots,x_n)$$

$$D_T f = \sum_{b \in \{0,1\}^k} f(b, x_{k+1},\ldots,x_n).$$

If $k = 2$ and $n = 4$, for example, then

$$C_{wx} f(w,x,y,z) = f(0,0,y,z) \cdot f(0,1,y,z) \cdot f(1,0,y,z) \cdot f(1,1,y,z)$$

$$D_{wx} f(w,x,y,z) = f(0,0,y,z) + f(0,1,y,z) + f(1,0,y,z) + f(1,1,y,z).$$
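
The product and sum formulas for the eliminants translate directly into code. The following is a minimal sketch, assuming the two-element algebra {0,1}, functions given as Python callables, and T taken to be the first k arguments as in the text; the sample function `f` is hypothetical.

```python
from itertools import product

def conjunctive_eliminant(f, k):
    """C_T f for T = first k arguments: the product of f(b, rest) over b in {0,1}^k."""
    def g(*rest):
        return int(all(f(*b, *rest) for b in product((0, 1), repeat=k)))
    return g

def disjunctive_eliminant(f, k):
    """D_T f for T = first k arguments: the sum of f(b, rest) over b in {0,1}^k."""
    def g(*rest):
        return int(any(f(*b, *rest) for b in product((0, 1), repeat=k)))
    return g

# Hypothetical example with n = 4: f = wy + x'z + yz.
def f(w, x, y, z):
    return (w & y) | ((1 - x) & z) | (y & z)

c_wx = conjunctive_eliminant(f, 2)    # function of (y, z)
d_wx = disjunctive_eliminant(f, 2)    # function of (y, z)

# Rule (iii): eliminating one argument at a time gives the same result.
c_step = conjunctive_eliminant(conjunctive_eliminant(f, 1), 1)
```

For this sample function the conjunctive eliminant with respect to w and x is yz and the disjunctive eliminant is y + z; neither involves the eliminated arguments.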

It is clear that the conjunctive and disjunctive eliminants of a Boolean
function f with respect to a subset T of the argument set may be expressed by
formulas  not  involving  any  of  the  arguments  in  T.  The  process  of  calculating 
such  formulas  may  in  some  cases  be  greatly  simplified  by  application  of  the 
two  theorems  which  follow. 


Theorem 1. Let $f: B^n \to B$ be a Boolean function expressed in terms of
arguments $x_1,\ldots,x_n$. Then

$$\mathrm{BCF}(C_{x_1} f) = \sum (\text{terms of } \mathrm{BCF}(f) \text{ not involving } x_1 \text{ or } \bar{x}_1).$$

Proof: The literals $x_1$ and $\bar{x}_1$ may be factored from the terms of BCF(f) in
which they appear, in such a way that $f$ is expressed as

$$f = \bar{x}_1 \sum_{i=1}^{L} p_i(x_2,\ldots,x_n) + x_1 \sum_{j=1}^{M} q_j(x_2,\ldots,x_n) + \sum_{k=1}^{N} r_k(x_2,\ldots,x_n),$$

where $p_1,\ldots,p_L$, $q_1,\ldots,q_M$, $r_1,\ldots,r_N$ are terms (products) not involving the
argument $x_1$. Thus $C_{x_1} f = f(0,x_2,\ldots,x_n)\,f(1,x_2,\ldots,x_n)$ may be expressed as

$$\Big[\sum_{i=1}^{L} p_i + \sum_{k=1}^{N} r_k\Big]\Big[\sum_{j=1}^{M} q_j + \sum_{k=1}^{N} r_k\Big] = \sum_{i=1}^{L}\sum_{j=1}^{M} p_i q_j + \sum_{k=1}^{N} r_k.$$

Every consensus formed by terms of BCF(f) is absorbed by a term of BCF(f). In
particular, every consensus of the form $p_i q_j$ is absorbed by one of the r-terms;
thus $\sum_{i}\sum_{j} p_i q_j \le \sum_{k} r_k$, and we conclude that $C_{x_1} f = \sum_{k=1}^{N} r_k$.
Thus $C_{x_1} f$ may be expressed as the portion
of BCF(f) that remains after every term involving $x_1$ or $\bar{x}_1$ is deleted. It is
shown in Chapter 4 that the result of such deletion is in Blake canonical
form. ■


Corollary 1.1. Let $f: B^n \to B$ be a Boolean function expressed in terms
of arguments $x_1,\ldots,x_n$ and let $T$ be a subset of $\{x_1,\ldots,x_n\}$. Then

$$\mathrm{BCF}(C_T f) = \sum (\text{terms of } \mathrm{BCF}(f) \text{ not involving arguments in } T). \tag{19}$$

Proof: By Theorem 1, (19) is valid if #T = 1, i.e., if T is a singleton set.
Suppose (19) to be valid if #T = k, and consider the case #T = k + 1, i.e., let
$T = \{x_j\} \cup R$, where #R = k and $x_j \notin R$. Then

$$\mathrm{BCF}(C_T f) = \mathrm{BCF}(C_{x_j}(C_R f)) = \sum (\text{terms of } \mathrm{BCF}(C_R f) \text{ not involving } x_j \text{ or } \bar{x}_j)$$

$$= \sum \big(\text{terms of } \textstyle\sum (\text{terms of } \mathrm{BCF}(f) \text{ not involving arguments in } R) \text{ not involving } x_j \text{ or } \bar{x}_j\big).$$

Thus (19) is valid for $T = \{x_j\} \cup R$. ■

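Example 3. The system

$$\begin{aligned}
w + x &= y \\
x + \bar{y} &= \bar{w}z
\end{aligned}$$

is equivalent to the single equation

$$x\bar{y} + \bar{w}\bar{x}y + \bar{w}\bar{z} + w\bar{y} + \bar{y}\bar{z} + wx + x\bar{z} = 0,$$

whose left side, f, is expressed in Blake canonical form. The conjunctive
eliminants expressed below are constructed by inspection of BCF(f), using
Theorem 1 and its corollary.

$$C_x f = \bar{w}\bar{z} + w\bar{y} + \bar{y}\bar{z} \qquad C_{wx} f = C_w(C_x f) = \bar{y}\bar{z}$$

$$C_w f = x\bar{y} + \bar{y}\bar{z} + x\bar{z} \qquad C_{wx} f = C_x(C_w f) = \bar{y}\bar{z}.$$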
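
Theorem 1 can be checked on this example by brute force. The sketch below assumes the system reads w + x = y and x + y' = w'z; the placement of the complements is an assumption, chosen because it reproduces a seven-term Blake canonical form with the variable pattern printed above. Prime implicants are found by enumerating all 81 candidate terms rather than by consensus, and all function names are illustrative.

```python
from itertools import product

VARS = "wxyz"
POINTS = list(product((0, 1), repeat=4))

def f(w, x, y, z):
    """f = 1 exactly at the points that violate the (assumed) system."""
    eq1 = (w | x) == y                      # w + x = y
    eq2 = (x | (1 - y)) == ((1 - w) & z)    # x + y' = w'z
    return int(not (eq1 and eq2))

def covers(term, point):
    """A term {index: value} covers a point when all its literals are 1 there."""
    return all(point[i] == v for i, v in term.items())

def is_implicant(term, func):
    return all(func(*p) for p in POINTS if covers(term, p))

def prime_implicants(func):
    """Implicants of func that contain no shorter implicant."""
    implicants = []
    for choice in product((None, 0, 1), repeat=4):
        t = {i: v for i, v in enumerate(choice) if v is not None}
        if is_implicant(t, func):
            implicants.append(t)
    return [t for t in implicants
            if not any(u != t and u.items() <= t.items() for u in implicants)]

def show(term):
    """Render a term, marking complemented literals with a prime."""
    return "".join(VARS[i] + ("" if v else "'") for i, v in sorted(term.items()))

def evaluate(terms, point):
    return int(any(covers(t, point) for t in terms))

bcf = prime_implicants(f)
bcf_without_x = [t for t in bcf if 1 not in t]   # index 1 is the argument x

def c_x(point):
    """Conjunctive eliminant of f with respect to x, by definition."""
    w, _, y, z = point
    return f(w, 0, y, z) & f(w, 1, y, z)
```

Under the stated assumption, the terms of `bcf` that do not involve x evaluate to the same function as `c_x`, in agreement with Theorem 1.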

Theorem 2. Let $f: B^n \to B$ be a Boolean function expressed in terms of
arguments $x_1,\ldots,x_n$. Then $D_{x_1} f$ is obtained from any SOP formula for $f$ by
replacing $x_1$ and $\bar{x}_1$, wherever they appear in the formula, by 1.

Proof: By definition, $D_{x_1} f(x_1,x_2,\ldots,x_n) = f(0,x_2,\ldots,x_n) + f(1,x_2,\ldots,x_n)$.
An SOP formula for $f$ may be expanded in the form

$$f(x_1,x_2,\ldots,x_n) = \bar{x}_1\,p(x_2,\ldots,x_n) + x_1\,q(x_2,\ldots,x_n) + r(x_2,\ldots,x_n),$$

where p, q, and r are SOP formulas not involving $x_1$; hence
$f(0,x_2,\ldots,x_n) = p(x_2,\ldots,x_n) + r(x_2,\ldots,x_n)$ and
$f(1,x_2,\ldots,x_n) = q(x_2,\ldots,x_n) + r(x_2,\ldots,x_n)$.
We deduce, therefore, that

$$D_{x_1} f(x_1,x_2,\ldots,x_n) = p(x_2,\ldots,x_n) + q(x_2,\ldots,x_n) + r(x_2,\ldots,x_n).$$

Thus $D_{x_1} f$ is produced by replacing the literals $x_1$ and $\bar{x}_1$ by 1 in the original
SOP formula for f. ■

We refer to the foregoing procedure, which was apparently given first
by Mitchell [3], as the "replace-by-1 trick."

Example 4. Let f(w,x,y,z) be given by

$$f = \bar{w}\bar{x}y\bar{z} + \bar{w}x\bar{y}z + wyz.$$

Then

$$D_{wx} f = y\bar{z} + \bar{y}z + yz = y + z$$

$$D_{yz} f = \bar{w}\bar{x} + \bar{w}x + w = 1.$$

Example 5. The following (correct) calculations illustrate potential
pitfalls in applying the replace-by-1 trick:

(a) $D_u(u + vw) = 1 + vw = 1$

(b) $D_u(\overline{u + v}) = D_u(\bar{u}\,\bar{v}) = \bar{v}$

(c) $D_u\,(u + v)(\bar{u} + w) = D_u(uw + \bar{u}v + vw) = w + v$

Calculation (a) illustrates that $D_u f$ is not found simply by deleting u and $\bar{u}$
(which would produce vw rather than 1 in this case), but by replacing u and $\bar{u}$
by 1. Calculations (b) and (c) illustrate the necessity that f be expressed in
sum-of-products form before the literals u and $\bar{u}$ are replaced by 1. If the
replacements are made in the original formulas, then the erroneous results would
be $D_u(\overline{u + v}) = \overline{(1 + v)} = 0$ for (b) and
$D_u\,(u + v)(\bar{u} + w) = (1 + v)(1 + w) = 1$ for (c).
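The replace-by-1 trick can be checked mechanically. The following is a minimal Python sketch, not part of the original report: the term representation (a dict from variable name to polarity) and the function names are assumptions made for illustration. It applies the trick to the multiplied-out form of calculation (c) and compares the result with the definition $D_u f = f(0,\dots) + f(1,\dots)$.

```python
from itertools import product

# A term is a dict {variable: polarity}; True means x, False means x-bar.
# A sum-of-products (SOP) formula is a list of such terms.

def eval_sop(sop, point):
    """Evaluate an SOP formula at a 0/1 assignment given as a dict."""
    return any(all(point[v] == pol for v, pol in term.items()) for term in sop)

def replace_by_1(sop, var):
    """Disjunctive eliminant D_var: drop every literal of var (set it to 1)."""
    return [{v: p for v, p in term.items() if v != var} for term in sop]

def eliminant_by_definition(sop, var, point):
    """D_var f = f(var=0) + f(var=1), evaluated at a point."""
    return (eval_sop(sop, {**point, var: False})
            or eval_sop(sop, {**point, var: True}))

# Calculation (c): (u + v)(u' + w) multiplied out is uw + u'v + vw.
f = [{"u": True, "w": True}, {"u": False, "v": True}, {"v": True, "w": True}]
d = replace_by_1(f, "u")  # w + v + vw, which equals v + w

for v, w in product([False, True], repeat=2):
    point = {"u": False, "v": v, "w": w}
    assert eval_sop(d, point) == eliminant_by_definition(f, "u", point)
    assert eval_sop(d, point) == (v or w)
```

An empty term (no literals left) evaluates to 1, which is why calculation (a) yields the constant 1 rather than vw.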
Elimination


A Boolean relation constrains the vector $x = (x_1,\dots,x_n)$ to lie within
a subset of $B^n$. It also constrains any k-element subvector of x $(1 \le k \le n)$ to
lie within a subset of $B^k$. The Boolean relation (14) describing an AND-gate,
for example, constrains (U,V,W) to lie within the subset {(0,0,0), (0,1,0),
(1,0,0), (1,1,1)} of the 8-element set $\{0,1\}^3$. Suppose we wish to find the
implied relation on the subvector (U,W). To do so, we simply delete the middle
element of each triple in (14), and keep the set of pairs that remains. The
resulting relation on (U,W) is

(U,W) ∈ {(0,0), (1,0), (1,1)}. (20)

We say in this case that V has been eliminated from the relation (14)
to produce the relation (20), and we call (20) the resultant of elimination of
V from (14). Relation (20), it should be emphasized, limits (U,W) to the same
subset of $\{0,1\}^2$ as does the original relation (14).
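Elimination by deletion of a coordinate is easy to sketch on explicit sets of tuples. The Python fragment below is an illustration only; the function name `eliminate` and the use of tuple positions to identify variables are assumptions, not notation from the report.

```python
# AND-gate relation (14) on (U, V, W): W = U AND V.
and_gate = {(0, 0, 0), (0, 1, 0), (1, 0, 0), (1, 1, 1)}

def eliminate(relation, position):
    """Resultant of elimination: delete one coordinate from every tuple."""
    return {t[:position] + t[position + 1:] for t in relation}

resultant_uw = eliminate(and_gate, 1)  # eliminate V (the middle element)
resultant_uv = eliminate(and_gate, 2)  # eliminate W (the last element)
```

Eliminating V reproduces relation (20); eliminating W leaves every pair (U,V) allowed, so the inputs are unconstrained when nothing is known about the output.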

If R is a Boolean relation, i.e., one equivalent to a Boolean equation

$$f(x_1,x_2,\dots,x_n) = 0, \qquad (21)$$

then the resultant of elimination of any argument from R is also a Boolean
relation. Thus, the resultant of elimination of $x_1$ from the equation (21) may be
expressed by an equation of the form

$$g(x_2,\dots,x_n) = 0. \qquad (22)$$

To determine the resultant (22) from equation (21), we may proceed by (i)
expressing (21) as an equivalent explicit subset of $B^n$, (ii) deleting the first
element of each n-tuple in the subset, and (iii) expressing the resulting
subset of $B^{n-1}$ as an equation of the form (22). The following result enables us,
however, to generate (22) directly from (21).


Theorem 3. The equation $g(x_2,\dots,x_n) = 0$ expresses the resultant of
elimination of $x_1$ from the equation $f(x_1,x_2,\dots,x_n) = 0$ if and only if the
identity

$$g = C_{x_1} f \qquad (23)$$

is fulfilled.


Proof: The fundamental theorem of Boolean algebra, together with properties
(1) through (3), gives rise to the following chain of equivalences:

$$\begin{aligned}
& f(x_1,x_2,\dots,x_n) = 0 \\
\iff\ & \bar{x}_1 f(0,x_2,\dots,x_n) + x_1 f(1,x_2,\dots,x_n) = 0 \\
\iff\ & \bar{x}_1 f(0,x_2,\dots,x_n) = 0 \ \text{ and } \ x_1 f(1,x_2,\dots,x_n) = 0 \\
\iff\ & f(0,x_2,\dots,x_n) \le x_1 \ \text{ and } \ x_1 \le \bar{f}(1,x_2,\dots,x_n) \\
\iff\ & f(0,x_2,\dots,x_n) \le x_1 \le \bar{f}(1,x_2,\dots,x_n).
\end{aligned}$$

The constraint imposed by (10) on the subvector $(x_2,\dots,x_n)$ is

$$f(0,x_2,\dots,x_n) \le \bar{f}(1,x_2,\dots,x_n), \qquad (24)$$

which may be re-expressed as

$$f(0,x_2,\dots,x_n) \cdot f(1,x_2,\dots,x_n) = 0.$$

Thus g = 0 is the resultant of elimination of $x_1$ from f = 0 if and only if the
condition (23) is satisfied. ■

Corollary 3.1. The equation $G(x_2,\dots,x_n) = 1$ expresses the resultant of
elimination of $x_1$ from the equation $F(x_1,x_2,\dots,x_n) = 1$ if and only if the
identity

$$G = D_{x_1} F \qquad (25)$$

is fulfilled.

Example 6. The AND-gate discussed in Example 2 is characterized by either
of the equations f(U,V,W) = 0 or F(U,V,W) = 1, where the functions f and F
are defined by

$$f = \bar{U}W + \bar{V}W + UV\bar{W} \qquad (26)$$

$$F = \bar{U}\bar{W} + \bar{V}\bar{W} + UVW.$$

Applying Theorem 3 and its corollary, the resultant of elimination of V is
expressed by either of the equations g(U,W) = 0 or G(U,W) = 1, where

$$\begin{aligned}
g = C_V f &= f(U,0,W) \cdot f(U,1,W) \\
&= (\bar{U}W + W) \cdot (\bar{U}W + U\bar{W}) \\
&= \bar{U}W
\end{aligned}$$

$$\begin{aligned}
G = D_V F &= F(U,0,W) + F(U,1,W) \\
&= (\bar{U}\bar{W} + \bar{W}) + (\bar{U}\bar{W} + UW) \\
&= U + \bar{W}.
\end{aligned}$$

The resultant is expressed, therefore, either by $\bar{U}W = 0$ or by $U + \bar{W} = 1$; either
of these equations, or the equivalent inclusion $W \le U$, is equivalent to the
relation (20). These relations express all that is known concerning U and W,
in the absence of knowledge concerning V.

If we eliminate W from (12), the resultant is $(UV)(\bar{U} + \bar{V}) = 0$, i.e.,
0 = 0. The latter relation on (U,V) is an identity; it allows (U,V) to be
chosen freely, that is, from $\{0,1\}^2$, which confirms our expectation that the
inputs to an AND-gate should be unconstrained in value if nothing is known
concerning the value of the output.
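The eliminants of Example 6 can be verified by exhaustive evaluation. The sketch below is illustrative Python, not the report's method; it assumes the reading of (26) in which f is zero exactly when W = U AND V (the overbars are not legible in the scanned source), and it takes F as the complement of f.

```python
from itertools import product

def f(u, v, w):
    """f = U'W + V'W + UVW' : zero exactly when W = U AND V."""
    return int(((not u) and w) or ((not v) and w) or (u and v and (not w)))

def F(u, v, w):
    """Complement of f: one exactly when W = U AND V."""
    return 1 - f(u, v, w)

def g(u, w):
    """Conjunctive eliminant C_V f = f(U,0,W) * f(U,1,W)."""
    return f(u, 0, w) & f(u, 1, w)

def G(u, w):
    """Disjunctive eliminant D_V F = F(U,0,W) + F(U,1,W)."""
    return F(u, 0, w) | F(u, 1, w)

for u, w in product((0, 1), repeat=2):
    assert g(u, w) == int((not u) and w)   # g = U'W
    assert G(u, w) == int(u or (not w))    # G = U + W'
    # g = 0 holds exactly for the pairs allowed by relation (20)
    assert (g(u, w) == 0) == ((u, w) in {(0, 0), (1, 0), (1, 1)})
```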

The  Extended  Verification  Theorem 

We discuss in this section a result, due to Löwenheim and Müller
[4], which enables an implication between two Boolean equations to be
translated into an equivalent Boolean inclusion. The presentation in this section
is adapted from that of Rudeanu [6].

Let s be a single element of B and let $v = (v_1,v_2,\dots,v_n)$ be a vector
on B, i.e., $s \in B$ and $v \in B^n$. Then sv and vs are defined by

$$sv = vs = (sv_1, sv_2, \dots, sv_n).$$

Lemma 1. Let $f\colon B^n \to B$ be a Boolean function and let b be an element
of $B^n$ such that f(b) = 0. Then

$$f(b\,f(x) + x\,\bar{f}(x)) = 0 \quad \forall x \in B^n. \qquad (27)$$

Proof: By the fundamental theorem of Boolean algebra,

$$f(b\,f(x) + x\,\bar{f}(x)) = \bar{f}(x)\,f(x) + f(x)\,f(b).$$

Each term on the right-hand side of the foregoing equation has the value zero,
for any $x \in B^n$, proving (27). ■

Theorem 4 (Extended Verification Theorem). Let $f\colon B^n \to B$ and $g\colon B^n \to B$
be Boolean functions, and assume that the equation f(x) = 0 is consistent. Then
the following statements are equivalent:

(i) $(\forall x \in B^n)\ [f(x) = 0 \Rightarrow g(x) = 0]$

(ii) $(\forall x \in B^n)\ [g(x) \le f(x)]$

(iii) $(\forall x \in \{0,1\}^n)\ [g(x) \le f(x)]$.

Proof:

(i) ⇒ (ii): Let $b \in B^n$ be a solution of f(x) = 0, i.e., let f(b) = 0.
Then g(b) = 0. For any $x \in B^n$, $f(x\,\bar{f}(x) + b\,f(x)) = 0$ by Lemma 1; hence,
$g(x\,\bar{f}(x) + b\,f(x)) = 0$. Thus, for all $x \in B^n$,
$\bar{f}(x)\,g(x) + f(x)\,g(b) = \bar{f}(x)\,g(x) = 0$, i.e., $g(x) \le f(x)$, proving (ii).

(ii) ⇒ (iii): Trivial.

(iii) ⇒ (i): The functions f and g are Boolean; hence, they may be
written in minterm canonical form, i.e.,

$$f(x) = \sum_{i=0}^{2^n-1} f_i\,m_i(x) \quad\text{and}\quad g(x) = \sum_{i=0}^{2^n-1} g_i\,m_i(x)$$

for all $x \in B^n$. Assume (iii), i.e., assume that $g_i \le f_i$ $(i = 0,1,\dots,2^n-1)$, and
let $b \in B^n$ be a solution of f(x) = 0. Then $\sum_{i=0}^{2^n-1} f_i\,m_i(b) = 0$, which implies that
$f_i\,m_i(b) = 0$, and therefore that $g_i\,m_i(b) = 0$ $(i = 0,1,\dots,2^n-1)$. Thus g(b) = 0,
proving (i). ■

Corollary 4.1. Let $f\colon B^n \to B$ and $g\colon B^n \to B$ be Boolean functions and
assume that the equation f(x) = 0 is consistent. Then the following statements
are equivalent:

(i) $(\forall x \in B^n)\ [f(x) = 0 \Leftrightarrow g(x) = 0]$

(ii) $(\forall x \in B^n)\ [f(x) = g(x)]$

(iii) $(\forall x \in \{0,1\}^n)\ [f(x) = g(x)]$.


Proof: Immediate from Theorem 4 and the definition of equivalent systems. ■


Poretsky's Law of Forms

It is useful on some occasions to re-express the information supplied
by the Boolean equation f(x) = 0 in the equivalent form g(x) = h(x), where g is
any given Boolean function. The associated Boolean function h is specified by
the following theorem.

Theorem 5 (Poretsky's Law of Forms). Let $f, g, h\colon B^n \to B$ be Boolean
functions and suppose the equation f(x) = 0 to be consistent. Then the
equivalence

$$f(x) = 0 \iff g(x) = h(x) \qquad (28)$$

holds for all $x \in B^n$ if and only if

$$h = f \oplus g. \qquad (29)$$

Proof: Suppose (28) to hold for all x in $B^n$. Then (28) is equivalent, by
property (2) and Corollary 4.1, to the equation $f(x) = g(x) \oplus h(x)$ $(\forall x \in B^n)$.
Thus $g(x) \oplus f(x) = g(x) \oplus (g(x) \oplus h(x)) = h(x)$ $(\forall x \in B^n)$, from which we
deduce (29) directly. Suppose on the other hand that the function h is defined
by (29). Let $b \in B^n$ be one of the solutions of the consistent equation f(x) = 0.
Then $h(b) = f(b) \oplus g(b) = 0 \oplus g(b) = g(b)$, i.e., b is also a solution of
g(x) = h(x) (and we deduce that g(x) = h(x) is consistent). Thus f(x) = 0 ⇒
g(x) = h(x). To show that g(x) = h(x) ⇒ f(x) = 0, let $c \in B^n$ be any solution
of g(x) = h(x). Then $g(c) \oplus h(c) = 0$, whence $g(c) \oplus (f(c) \oplus g(c)) = 0$ by (29),
from which we deduce that f(c) = 0, proving (28). ■

Example 7. Suppose a Boolean function h is sought having the property
that the equation $x_1x_2 + x_3 = 0$ is equivalent to $x_2x_3 = h$. The first equation
is consistent (a solution, for example, is $x_1 = 0$, $x_2 = 0$, $x_3 = 0$); hence, h is
determined uniquely by (29), i.e.,


h  =  (xix2  +  x3^ ®  ^X2X3^ 

*  +  x3)  • 
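Poretsky's Law of Forms can be checked by brute force on the two-element algebra. The Python sketch below is an illustration under stated assumptions: overbars are not legible in the scanned Example 7, so the functions f and g used here are simply one choice picked for the sketch ($f = x_1x_2 + x_3$, $g = \bar{x}_2x_3$) and are not claimed to be the ones the report intended.

```python
from itertools import product

def f(x1, x2, x3):
    return (x1 & x2) | x3                  # assumed: f = x1 x2 + x3

def g(x1, x2, x3):
    return (1 - x2) & x3                   # assumed: g = x2' x3

def h(x1, x2, x3):
    return f(x1, x2, x3) ^ g(x1, x2, x3)   # Poretsky: h = f XOR g

for x in product((0, 1), repeat=3):
    # f(x) = 0 holds exactly when g(x) = h(x)
    assert (f(*x) == 0) == (g(*x) == h(*x))
    # closed form of h for these particular f and g: x2 (x1 + x3)
    assert h(*x) == x[1] & (x[0] | x[2])
```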
V. FUNCTIONAL DEDUCTION


An important potential application of the inferential processor is that
of generating functional consequences, i.e., conclusions of the form

$$x_1 = f(x_2,\dots,x_n), \qquad (30)$$

from a given system of logical equations on the variables $x_1,x_2,\dots,x_n$. If such
consequences exist, then we call $x_1$ functionally deducible from the given
equations and we say that $\{x_2,\dots,x_n\}$ is a determining subset for $x_1$.
Generating functional consequences from a given system of equations is the inverse of
solving the system; if (30) is a solution of a system, then the system is a
consequence of (30). The problem of solving logical equations was given
primary attention in Boole's original work, and has since been studied
intensively; there has been no progress to our knowledge, however, on the problem
of generating functional consequences.

Some very preliminary work on functional deduction was reported in [1];
we outline in this section the progress we have made in the meantime. A test
(Theorem 6) is given to determine, for a given logical database, the
functionally deducible variables; this test is well-suited for high-speed execution by
the inferential processor, inasmuch as it is based on the basic units of data
(prime implicants) stored in the processor. Given that a variable is
functionally deducible, the set of functions f for which (30) is a functional
consequence is specified by Corollary 6.1. A necessary and sufficient condition for
a subset of $\{x_2,\dots,x_n\}$ to be an $x_1$-determining subset is given in Theorem 7,
and an algorithm is given to generate the class of minimal $x_1$-determining
subsets. Finally, the theory of functional deduction is applied to the problem of
designing economical multiple-output combinational circuits.

The discussion in this section is based on the concepts and terminology
introduced in Section II.

Functional  Deducibility 

Let us suppose a collection of Boolean, i.e., propositional, data to be
reduced by the inferential processor to the single equation

$$\phi(x_1,x_2,\dots,x_n) = 1. \qquad (31)$$

We say that $x_1$ is functionally deducible from (31) in case there is a Boolean
function f such that equation (30) is a consequence of (31). We call
(30) a functional consequence of (31).

Theorem 6. The following statements are equivalent:

(i) $x_1$ is functionally deducible from $\phi(x_1,\dots,x_n) = 1$.

(ii) $D_{x_1}\bar{\phi} = 1$.

(iii) $C_{x_1}\phi = 0$.

(iv) $x_1$ or $\bar{x}_1$ appears in every term of BCF($\phi$).

Proof:

(i) ⇔ (ii) ⇔ (iii): The equivalence of the following statements follows
directly from the results of Section II. In particular, the equivalence
of (a) and (b) follows from the extended verification theorem (Theorem
4) and property (2).

(a) $(\exists f)\ [\phi(x_1,\dots,x_n) = 1 \Rightarrow x_1 = f(x_2,\dots,x_n)]$

(b) $(\exists f)\ [\phi(x_1,x_2,\dots) \le \overline{x_1 \oplus f(x_2,\dots)}]$

(c) $(\exists f)\ [\phi(0,x_2,\dots) \le \bar{f}(x_2,\dots) \ \text{ and } \ \phi(1,x_2,\dots) \le f(x_2,\dots)]$
any Boolean function in the non-empty interval

$$D_W(\phi_u) \le f \le \overline{D_W(\phi_{\bar{u}})}. \qquad (35)$$

Proof: By Theorem 6, the variable u is deducible from $D_W\phi = 1$ if and only if
$C_u(D_W\phi) = 0$, i.e., $(D_W\phi)_u \cdot (D_W\phi)_{\bar{u}} = 0$. From the identities

$$(D_W\phi)_u = D_W(\phi_u) \qquad (36a)$$

$$(D_W\phi)_{\bar{u}} = D_W(\phi_{\bar{u}}) \qquad (36b)$$

we conclude that u is deducible from $D_W\phi = 1$ if and only if (34) is satisfied,
in which case, by Corollary 6.1, we obtain the equation u = f(v) as a
consequence, where $(D_W\phi)_u \le f \le \overline{(D_W\phi)_{\bar{u}}}$. Identities (36a) and (36b) lead therefore
to (35). ■

Generating minimal determining subsets. The following procedure, based
on Theorem 7, generates a convenient representation of the class of minimal
u-determining subsets.


Step 1. Express $\phi_u$ and $\phi_{\bar{u}}$ as sum-of-products (disjunctive normal)
formulas, viz.,

$$\phi_u = \sum_{i=1}^{m} p_i, \qquad \phi_{\bar{u}} = \sum_{j=1}^{M} q_j.$$

Step 2. Associate with each pair $(p_i, q_j)$ of terms an alterm $s_{ij}$ defined
by the summation

$$s_{ij} = \sum\ (\text{letters that appear opposed in } p_i \text{ and } q_j).$$

Step 3. Construct the product-of-sums formula

$$a_u = \prod_{i=1}^{m} \prod_{j=1}^{M} s_{ij}.$$

Step 4. Multiply out, to form a sum-of-products formula for $a_u$, and
delete absorbed terms.

Step 5. With each term $xy\cdots z$ of $a_u$ associate a minimal u-determining
subset $\{x,y,\dots,z\}$.
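Steps 1 through 5 can be sketched in a few lines of Python. The code below is an illustration, not the report's implementation: the helper names `term`, `opposed`, and `minimal_determining_subsets` are hypothetical, and the input data are the formulas $\phi_A$ and $\phi_{\bar{A}}$ of Example 8 as reconstructed from the scan (the overbars, written here as `~`, are an assumption).

```python
def term(text):
    """Parse 'B C ~D' into {'B': True, 'C': True, 'D': False}."""
    return {lit.lstrip("~"): not lit.startswith("~") for lit in text.split()}

def opposed(p, q):
    """Step 2: letters that appear with opposite polarity in terms p and q."""
    return frozenset(v for v in p if v in q and p[v] != q[v])

def minimal_determining_subsets(phi_u, phi_not_u):
    # Step 3: a_u is the product of one alterm per pair of terms.
    alterms = {opposed(p, q) for p in phi_u for q in phi_not_u}
    # Step 4: multiply out, deleting absorbed terms after each factor.
    products = {frozenset()}
    for s in alterms:
        expanded = set()
        for t in products:
            if t & s:                      # t already contains a letter of s
                expanded.add(t)
            else:
                expanded.update(t | {x} for x in s)
        products = {t for t in expanded if not any(o < t for o in expanded)}
    # Step 5: each surviving product names a minimal determining subset.
    return sorted(sorted(t) for t in products)

# Example 8 data (assumed reading of the overbars):
phi_A = [term("B C ~D"), term("~C D E"), term("C ~D E"), term("~B ~C ~D ~E")]
phi_not_A = [term("C D"), term("B ~C ~D E")]
subsets = minimal_determining_subsets(phi_A, phi_not_A)
```

If some pair of terms has no opposed letter, the corresponding alterm is empty, the product collapses to zero, and no determining subset exists, which agrees with the requirement that $\phi_u \cdot \phi_{\bar{u}} = 0$ for u to be deducible.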

Example 8. Let us examine for functional deducibility the data given in
a problem widely quoted by early logicians (Boole [2], Chapter IX):

"Suppose that an analysis of the properties of a particular class of
substances leads to the following statements:

(1) Whenever properties A and C are missing, then property E is found,
together with one of the properties B and D, but not both.

(2) Whenever the properties A and D are found while E is missing, then
B and C will either both be found or both be missing.

(3) Whenever property A is found in conjunction with either B or E, or
both of them, then C or D will also be found, but not both of them.
Conversely, whenever C or D (but not both) is found, then A will be
found in conjunction with either B or E or both of them."

The foregoing data are equivalent to the single equation

$$\phi = 1, \qquad (37)$$

where $\phi$ is given in Blake canonical form by

$$\mathrm{BCF}(\phi) = \bar{A}CD + ABC\bar{D} + A\bar{C}DE + AC\bar{D}E + \bar{A}B\bar{C}\bar{D}E + A\bar{B}\bar{C}\bar{D}\bar{E}. \qquad (38)$$

The variables appearing in every term of BCF($\phi$) are A, C, and D; hence, by
Theorem 6, these are the variables functionally deducible from (37).

Let us consider the functionally deducible variable A; in particular,
let us determine the minimal A-determining subsets of {B,C,D,E}.

$$\phi_A = BC\bar{D} + \bar{C}DE + C\bar{D}E + \bar{B}\bar{C}\bar{D}\bar{E}$$

$$\phi_{\bar{A}} = CD + B\bar{C}\bar{D}E$$

Thus,

$$\begin{aligned}
a_A &= (D)(C)(C)(D)(D)(C)(C + D)(B + E) \\
&= CD(B + E) \\
&= BCD + CDE.
\end{aligned}$$

The minimal A-determining subsets, therefore, are {B,C,D} and {C,D,E}. To
determine f in the functional consequence A = f(B,C,D), we apply (35) in Theorem
7, viz.,

$$D_{\{E\}}(\phi_A) \le f \le \overline{D_{\{E\}}(\phi_{\bar{A}})}.$$

Thus,

$$BC\bar{D} + \bar{C}D + C\bar{D} + \bar{B}\bar{C}\bar{D} \le f \le \bar{C}D + \bar{B}\bar{C} + C\bar{D}.$$

Two simplified functional consequences are derived from the foregoing
interval, viz.,

$$A = \bar{C}D + C\bar{D} + \bar{B}\bar{C}$$
$$A = \bar{C}D + C\bar{D} + \bar{B}\bar{D}.$$

Similar analysis yields the following functional consequence based on the
A-determining subset {C,D,E}:

$$A = \bar{C}D + C\bar{D} + \bar{D}\bar{E}.$$

We noted earlier that the variables functionally deducible from (37), in
addition to A, are C and D. The (unique) C-determining and D-determining subsets
are {A,B,D,E} and {A,C,E}, respectively; the corresponding functional
consequences, in simplified form, are

$$C = \bar{A}D + B\bar{E} + A\bar{D}E$$
$$D = \bar{A}C + A\bar{C}E.$$
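The conclusions of Example 8 can be cross-checked directly from the three quoted statements, without any algebra, by enumerating the 32 assignments. The Python sketch below is a brute-force illustration only (it is not the prime-implicant method the report proposes for the inferential processor), and the function names are hypothetical.

```python
from itertools import combinations, product

NAMES = "ABCDE"

def phi(a, b, c, d, e):
    """Conjunction of statements (1)-(3) of Example 8."""
    s1 = (a or c) or (e and (b != d))          # A', C' -> E and (B xor D)
    s2 = not (a and d and not e) or (b == c)   # A, D, E' -> (B iff C)
    s3 = (a and (b or e)) == (c != d)          # A(B + E) <-> (C xor D)
    return bool(s1 and s2 and s3)

SOLUTIONS = [x for x in product((False, True), repeat=5) if phi(*x)]

def determines(subset, target):
    """True if the values on `subset` fix `target` in every solution."""
    seen = {}
    k = NAMES.index(target)
    for x in SOLUTIONS:
        key = tuple(x[NAMES.index(v)] for v in subset)
        if seen.setdefault(key, x[k]) != x[k]:
            return False
    return True

def deducible():
    """Variables determined by all the remaining variables."""
    return [v for v in NAMES if determines([w for w in NAMES if w != v], v)]

def minimal_determining(target):
    """Minimal subsets of the other variables that determine `target`."""
    others = [v for v in NAMES if v != target]
    found = []
    for size in range(len(others) + 1):
        for subset in combinations(others, size):
            if determines(subset, target) and not any(
                    set(f) <= set(subset) for f in found):
                found.append(subset)
    return found
```

Calling `deducible()` and `minimal_determining("A")` reproduces the deducible variables and the two minimal A-determining subsets stated in the example.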
Circuit Design Based on Functional Deduction


An n-input, k-output combinational circuit is typically specified by a
system of equations of the form

$$\begin{aligned}
z_1 &= f_1(x_1,\dots,x_n) \\
&\ \ \vdots \\
z_k &= f_k(x_1,\dots,x_n) \\
0 &= f_0(x_1,\dots,x_n).
\end{aligned} \qquad (39)$$

The latter equation represents any "don't-care" conditions that may exist on
allowable input-combinations.

It was observed as early as 1951 [3] that a system of the form

$$\begin{aligned}
z_1 &= g_1(x_1,\dots,x_n) \\
z_2 &= g_2(x_1,\dots,x_n,z_1) \\
z_3 &= g_3(x_1,\dots,x_n,z_1,z_2) \\
&\ \ \vdots \\
z_k &= g_k(x_1,\dots,x_n,z_1,\dots,z_{k-1})
\end{aligned} \qquad (40)$$

may meet the functional specifications of (39) at reduced logical cost.
Outputs, that is, may be used to assist in the generation of other outputs. The
recursive structure of (40) guarantees that the resulting circuit is free of
closed loops. There are cases, e.g., the end-around carry in a one's-complement
adder, in which closed loops may be employed with good effect in
combinational design [4,5,6]. Such loops, however, introduce the possibility of
oscillations and other problems inherent in the design of asynchronous
sequential circuits; we therefore confine our attention to loop-free specifications
of the form (40). We call the corresponding realizations recursive circuits.




The logical cost of a recursive circuit depends on which outputs are
allowed to depend on which other outputs; the sequence (1,2,...,k) specified
by (40) is only one of k! possible sequences. No method has hitherto been
known for determining a promising sequence prior to working out the actual
functions corresponding to that sequence. Recursive circuits have consequently
been regarded as difficult to design, even though their potential economy has
been well-recognized; the design of such circuits is stated in [7] to be
"practicable for synthesizing a net which has not more than two or three
outputs."

Functional deduction provides a way to overcome the foregoing
difficulties, enabling recursive circuits to be designed conveniently. The
following procedure is based on minimizing the number of arguments upon which the
output-functions depend.

Step 1. Reduce the original specification (39) to a single equation of
the form $\phi(x_1,\dots,x_n,z_1,\dots,z_k) = 1$.

Step 2. Calculate the $z_i$-determining subsets $(i = 1,2,\dots,k)$.

Step 3. Select a sequence $S_{i_1}, S_{i_2}, \dots, S_{i_k}$ of subsets of
$\{x_1,\dots,x_n,z_1,\dots,z_k\}$ having the following properties:

(a) $S_{i_r}$ is a $z_{i_r}$-determining subset $(r = 1,\dots,k)$.

(b) $S_{i_1}$ is a subset of $\{x_1,\dots,x_n\}$;
$S_{i_2}$ is a subset of $\{x_1,\dots,x_n,z_{i_1}\}$;
$S_{i_3}$ is a subset of $\{x_1,\dots,x_n,z_{i_1},z_{i_2}\}$; etc.

(c) The subsets $S_{i_1}, S_{i_2}, \dots, S_{i_k}$ are as small as possible.

Step 4. Construct simplified consequences of the form $z_{i_r} = g_r(S_{i_r})$
$(r = 1,\dots,k)$, where the arguments of $g_r$ are those appearing in $S_{i_r}$.

Example 9. A multiple-output circuit is specified by the equations

$$z_1 = \bar{a} + bc$$
$$z_2 = a\bar{b} + \bar{c}$$
$$z_3 = \bar{a} + \bar{b} + \bar{c}.$$

Let us apply the procedure given above, with the object of
reducing the logical cost of the foregoing specifications.

Step 1:

$$\phi = \bar{a}\bar{c}z_1z_2z_3 + \bar{a}cz_1\bar{z}_2z_3 + a\bar{b}\bar{z}_1z_2z_3 + ab\bar{c}\bar{z}_1z_2z_3 + abcz_1\bar{z}_2\bar{z}_3$$

Step 2: Calculation of $z_1$-determining subsets:

$$\phi_{z_1} = \bar{a}\bar{c}z_2z_3 + \bar{a}c\bar{z}_2z_3 + abc\bar{z}_2\bar{z}_3$$

$$\phi_{\bar{z}_1} = a\bar{b}z_2z_3 + ab\bar{c}z_2z_3$$

$$\begin{aligned}
a_{z_1} &= (a)(a)(a + z_2)(a + c + z_2)(b + z_2 + z_3)(c + z_2 + z_3) \\
&= abc + az_2 + az_3.
\end{aligned}$$

Similarly,

$$a_{z_2} = abc + cz_1 + acz_3$$

$$a_{z_3} = abc + az_1 + az_2.$$

Step 3: Two subset-sequences are promising:

Sequence #1
$S_1 = \{a,b,c\}$
$S_2 = \{c,z_1\}$
$S_3 = \{a,z_1\}$ or $\{a,z_2\}$

Sequence #2
$S_3 = \{a,b,c\}$
$S_1 = \{a,z_3\}$
$S_2 = \{c,z_1\}$

Step 4: Simplified functional consequences:

Sequence #1
$z_1 = \bar{a} + bc$
$z_2 = \bar{c} + \bar{z}_1$
$z_3 = \bar{a} + \bar{z}_1$

Sequence #2
$z_3 = \bar{a} + \bar{b} + \bar{c}$
$z_1 = \bar{a} + \bar{z}_3$
$z_2 = \bar{c} + \bar{z}_1$


Either of the foregoing realizations is more economical than
a direct realization of the original specifications. Each
requires a single IC package, sequence #1 a quad 2-input
NAND and sequence #2 a triple 3-input NAND.
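The two recursive realizations of Example 9 can be compared against the original specification by truth table. The Python sketch below is illustrative; the overbars in the scanned specification are not legible, so the reading $z_1 = \bar{a} + bc$, $z_2 = a\bar{b} + \bar{c}$, $z_3 = \bar{a} + \bar{b} + \bar{c}$ is an assumption, chosen because it is consistent with the determining subsets and the NAND package counts stated in the example.

```python
from itertools import product

def nand(*xs):
    """NAND of any number of 0/1 inputs."""
    return int(not all(xs))

def spec(a, b, c):
    """Original specification (overbars as assumed above)."""
    z1 = int((not a) or (b and c))
    z2 = int((a and not b) or (not c))
    z3 = int((not a) or (not b) or (not c))
    return z1, z2, z3

def sequence_1(a, b, c):
    """Four 2-input NAND gates."""
    z1 = nand(a, nand(b, c))   # z1 = a' + bc
    z2 = nand(c, z1)           # z2 = c' + z1'
    z3 = nand(a, z1)           # z3 = a' + z1'
    return z1, z2, z3

def sequence_2(a, b, c):
    """One 3-input and two 2-input NAND gates."""
    z3 = nand(a, b, c)         # z3 = a' + b' + c'
    z1 = nand(a, z3)           # z1 = a' + z3'
    z2 = nand(c, z1)           # z2 = c' + z1'
    return z1, z2, z3

for abc in product((0, 1), repeat=3):
    assert sequence_1(*abc) == spec(*abc) == sequence_2(*abc)
```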


Example 10. The input-logic for a clocked D-latch is defined by the
equations

$$U = \bar{C} + \bar{D} \qquad (41a)$$

$$V = \bar{C} + D, \qquad (41b)$$

where U and V are excitation-signals for a NAND-latch, C is a clock-input, and
D is a data-input. A circuit implementing (41a) requires a single NAND-gate;
however, (41b) requires an inverter in addition to a NAND-gate. To simplify
(41b), we resort to functional deduction. The system (41) is equivalent to the
single equation $\phi = 1$, where

$$\mathrm{BCF}(\phi) = \bar{C}UV + CD\bar{U}V + C\bar{D}U\bar{V}. \qquad (42)$$

We deduce from (42) that C, U, and V are functionally deducible from $\phi = 1$.
The corresponding determining subsets are represented by the functions $a_C$, $a_U$,
and $a_V$:

$$a_C = UV$$
$$a_U = CD + CV$$
$$a_V = CD + CU.$$

The function $a_V$ implies that {C,U} is a V-determining subset. The
corresponding functional consequence is specified by (35) as follows:

$$D_{\{D\}}(\phi_V) \le V \le \overline{D_{\{D\}}(\phi_{\bar{V}})},$$

i.e.,

$$\bar{C}U + C\bar{U} \le V \le \bar{C} + \bar{U}.$$

A simplified functional consequence specifying V, therefore, is

$$V = \bar{C} + \bar{U}.$$

A circuit implementing the latter relation requires only a single NAND-gate.
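A truth-table check of the simplified D-latch input logic takes only a few lines. This Python sketch is illustrative and assumes the reading $U = \bar{C} + \bar{D}$, $V = \bar{C} + D$ of equations (41), since the overbars are not legible in the scanned source.

```python
from itertools import product

def nand(x, y):
    return int(not (x and y))

results = []
for c, d in product((0, 1), repeat=2):
    u = nand(c, d)                 # U = C' + D'   (41a)
    v_spec = int((not c) or d)     # V = C' + D    (41b)
    v_deduced = nand(c, u)         # V = C' + U'   (functional consequence)
    results.append((c, d, u, v_spec, v_deduced))

assert all(v_spec == v_deduced for (_, _, _, v_spec, v_deduced) in results)
```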


Example 11. Let us suppose that we are to design an asynchronous
sequential circuit having inputs $a_1$ and $a_0$ and output z. The output is to have
the value 1 if and only if the present value of the binary number $a_1a_0$ is
greater than the preceding value. We assume that the signals $a_1$ and $a_0$ cannot
change simultaneously.

By standard processes of asynchronous-circuit design we arrive at the
specification

$$y = \mathrm{maj}(a_1,a_0,y)$$
$$z = \mathrm{maj}(a_1,a_0,\bar{y}),$$

where y is an internal state-variable and where the "majority" function maj is
defined by maj(x,y,z) = xy + xz + yz. The foregoing specifications are best
implemented by full adders (FA's), which generate majority-functions at their
carry-outputs; the resulting circuit is shown in Figure 1.

The circuit of Figure 1 requires two packages, a dual full adder and a
hex inverter. Only one-sixth of the inverter-package is employed, however, and
the upper full adder provides a sum-output,

$$s = a_1 \oplus a_0 \oplus y,$$

which is not employed at all.

Fig. 1. Asynchronous circuit: original design.

These observations lead us to apply functional
deduction to the expanded system

$$y = \mathrm{maj}(a_1,a_0,y)$$
$$z = \mathrm{maj}(a_1,a_0,\bar{y})$$
$$s = a_1 \oplus a_0 \oplus y.$$

The foregoing specifications are equivalent to the single equation $\phi = 1$, where

$$\phi = \bar{a}_1\bar{a}_0\bar{s}\bar{y}\bar{z} + (\bar{a}_1a_0 + a_1\bar{a}_0)(s\bar{y}z + \bar{s}y\bar{z}) + a_1a_0syz.$$

Thus

$$\phi_z = \bar{a}_1a_0s\bar{y} + a_1\bar{a}_0s\bar{y} + a_1a_0sy$$
$$\phi_{\bar{z}} = \bar{a}_1\bar{a}_0\bar{s}\bar{y} + \bar{a}_1a_0\bar{s}y + a_1\bar{a}_0\bar{s}y,$$

whence

$$\begin{aligned}
a_z = {} & (a_0 + s)(s + y)(a_1 + a_0 + s + y)(a_1 + s)(a_1 + a_0 + s + y) \\
& (s + y)(a_1 + a_0 + s + y)(a_1 + s)(a_0 + s),
\end{aligned}$$

i.e.,

$$a_z = a_1a_0y + s.$$

A result (surprising to this investigator) of the function $a_z$ is that one of
the z-determining subsets is {s}. The corresponding z-consequence is specified
by the interval

$$D_{\{a_1,a_0,y\}}(\phi_z) \le z \le \overline{D_{\{a_1,a_0,y\}}(\phi_{\bar{z}})},$$

i.e.,

$$s \le z \le s.$$

Thus, z is given by

$$z = s.$$

The corresponding circuit, shown in Figure 2, requires only one-half of a dual
full-adder package.
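The consequence z = s can be confirmed by enumerating the stable states of the circuit. The Python sketch below is an illustration; it assumes that the complemented argument in the specification is $\bar{y}$ in the equation for z (the overbar is not legible in the scan, but this reading matches the single inverter of Figure 1 and the formula for $\phi$).

```python
from itertools import product

def maj(x, y, z):
    """Majority of three 0/1 values: xy + xz + yz."""
    return (x & y) | (x & z) | (y & z)

stable = []
for a1, a0, y in product((0, 1), repeat=3):
    if y == maj(a1, a0, y):          # y is a stable internal state
        z = maj(a1, a0, 1 - y)       # output of the original design
        s = a1 ^ a0 ^ y              # sum output of the same full adder
        stable.append((a1, a0, y, z, s))

# In every stable state the output equals the sum output.
assert all(z == s for (_, _, _, z, s) in stable)
```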
INFERENTIAL ANALYSIS OF RELATIONAL DATABASES


Donald  Keith  Taylor 


FOREWORD 


Database-processing  is  an  important  potential  application  of 
the  proposed  inferential  processor.  We  show  in  this  study  that 
propositional  deduction  may  be  used  to  determine  the  functional 
dependencies  in  a  relational  database,  from  which  (as  is  well- 
known)  the  keys  for  the  database  may  be  determined.  In 
particular,  we  have  developed  and  programmed  a  two-part  algo¬ 
rithm:  the  first  part  generates  the  functional  dependencies  of 
the  relation;  the  second  part  uses  these  dependencies,  together 
with  rules  for  propositional  inference,  to  generate  the  keys  of 
the  relation.  The  algorithm  is  programmed  in  the  logical  language 
PROLOG. 
CHAPTER I


INTRODUCTION 

A major problem in the design and use of
computers is that of storing, retrieving, and updating
large  quantities  of  non-numerical  data.  This  problem 
is  usually  managed  by  storing  these  data  in  a 
database.  Several  types  of  databases  exist;  however, 
the  relational  database  has  the  simplest  and  most 
regular  structure.  This  structure  makes  the 
relational  database  attractive  for  use  in  large,  high¬ 
speed  data  retrieval  systems  employing  associative 
memories  and  parallel  processors. 

The  relational  model  is  based  on  the  idea  that  a 
database  containing  information  about  a  particular 
object  (e.g.,  a  university  class-schedule)  can  be 
viewed  as  a  relation  on  a  set  of  attributes;  the 
attributes  for  a  class-schedule  would  be  the  course 
number,  the  room  number,  the  professor's  name,  and  so 
on.  The  data  of  the  relational  database  are  stored  in 
a  simple  tabular  form,  one  row  for  each  record,  and 
one  column  for  each  attribute. 

The data in each row of the table are accessed by
using a key of the database. A key in a relational
database is a subset of its attributes which "unlocks"
the database: if the value of each attribute in a key
is specified, a unique row of the table is
determined. The keys of a relational database are
sometimes very difficult to locate; however,
examination of the functional dependencies inherent in
a database will generate the desired keys.
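The notion of a key can be illustrated with a brute-force check on a small table. The Python sketch below is not the FD-Key algorithm of this study (which works from functional dependencies and is programmed in PROLOG); the function names and the sample class-schedule rows are invented for illustration, and the rows are assumed to be distinct.

```python
from itertools import combinations

def is_superkey(rows, attrs):
    """True if no two rows agree on every attribute in attrs."""
    seen = {tuple(row[a] for a in attrs) for row in rows}
    return len(seen) == len(rows)

def candidate_keys(rows):
    """Minimal attribute subsets that identify each row (brute force)."""
    names = sorted(rows[0])
    keys = []
    for size in range(1, len(names) + 1):
        for attrs in combinations(names, size):
            if is_superkey(rows, attrs) and not any(
                    set(k) <= set(attrs) for k in keys):
                keys.append(attrs)
    return keys

# Hypothetical class-schedule relation (invented data).
schedule = [
    {"course": "EE101", "room": "A1", "professor": "Smith"},
    {"course": "EE102", "room": "A1", "professor": "Jones"},
    {"course": "EE201", "room": "B2", "professor": "Smith"},
]
keys = candidate_keys(schedule)
```

A key found this way holds only for the rows at hand; whether it is a key of the database in general depends on the functional dependencies that every permitted instance must satisfy.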

The  functional  dependencies  of  a  database  have 
many  uses  in  modern  database  theory.  However,  no 
clearly  defined  generation  method  for  these 
dependencies  has  been  developed.  Using  the  recently 
proven  fact  that  propositional  (Boolean)  logic  can  be 
used  to  characterize  the  functional  dependencies 
inherent  in  a  relational  database,  an  algorithmic 
procedure  to  generate  these  dependencies  will  be 
derived. By applying Boolean analysis to these
dependencies, an algorithm will be developed to
determine the keys of a relational database. The two
preceding  algorithms  will  be  joined  together  to  form 
the  FD-Key  algorithm.  The  FD-Key  algorithm  has  the 
capability  to  generate  the  functional  dependencies  of 
a  relational  database;  using  these  dependencies,  the 
keys  of  the  database  may  be  located.  To  demonstrate 
the  feasibility  of  the  FD-Key  algorithm,  the  logic 
programming  language,  PROLOG,  will  be  used  to  generate 
the  functional  dependencies  and  keys  of  a  given 
relational database.

In Chapter II, the basic concepts of relational
databases  are  discussed,  emphasizing  the  terms, 
components,  and  properties  of  such  databases.  Also, 
some  associated  problems  of  utilizing  relational 
databases  are  explored. 

Chapter  III  presents  some  fundamental  rules  and 
properties  of  propositional  logic  and  Boolean 
analysis.  Also,  the  equivalence  of  propositional 
logic  and  relational  databases  is  discussed.  An 
algorithm  to  generate  the  keys  of  a  database  from  its 
functional  dependencies  is  developed.  This  algorithm 
is  later  used  as  one  of  the  main  components  of  the  FD- 
Key  algorithm. 

In  Chapter  IV,  the  algorithm  to  generate  the 
functional  dependencies  from  a  relational  database  is 
developed.  As  an  example,  the  functional  dependencies 
and keys for a given relation are derived.

Chapter  V  discusses  the  basic  concepts  of  the 
logic  programming  language  PROLOG.  Using  these 
concepts,  Chapter  VI  presents  the  syntax  rules  and 
system  requirements  for  correct  implementation  of  the 
FD-Key  algorithm  developed  in  Chapters  III  and  IV. 

Suggestions for future work involving the FD-Key
algorithm are presented in Chapter VII. Chapter VIII
contains  a  brief  summary  of  the  work  and  conclusions 
presented  in  this  thesis.  The  flowcharts  of  the  FD- 
Key  algorithm  presented  in  this  thesis  are  contained 
in  Appendix  A.  Appendix  B  is  made  up  of  the  actual 
PROLOG  software  used  to  execute  the  FD-Key  algorithms. 
Finally,  Appendix  C  contains  executions  of  the  FD-Key 
algorithm  in  PROLOG  for  several  sample  relations. 


CHAPTER II


INTRODUCTION TO DATABASES

Database Models

A typical database is organized into three
different parts: a collection of interrelated data,
the hardware necessary to store the data, and the
software required to use the data in a real-world
application. The database must accurately represent
some undertaking in the real world, and it must be at
the user's disposal. The currently available hardware
imposes a structure upon the data. This structure is
called a schema, and it defines the data model used in
creating the database. Each model is given a name
which represents the way data are viewed by the users.
The three currently used structures are the network,
hierarchy, and relational models. The database
systems that are currently in existence were proposed
and studied in many different reports by several
authors [1,2,3,4,5,9,12,14,15,17,18,19,21,23].

The network model was first proposed by the
Committee on Data System Language (CODASYL). This
model consists of various blocks of data organized in
a network. The access time for some blocks of data is
very fast, but the user must set up the structure of
the system, which cannot be altered once the data
have been stored.

The  second  data  structure  is  the  hierarchical  data 
model.  Here,  data  blocks  with  similar  characteristics 
are  accessed  by  a  common  data  path.  Hence,  access 
time  between  data  blocks  with  similar  information  is 
very  small,  but  access  time  between  blocks  with  very 
dissimilar  data  can  be  very  large. 

The  third  data  structure  is  the  relational  model 
developed by E. F. Codd [10]. In a relational
database,  the  data  are  normalized  into  a  form  where 
the  relationships  among  data  items  appear  in  a  two- 
dimensional  tabular  form.  Most  users  have  very  little 
trouble  in  understanding  this  data  model  since  the 
two-dimensional  table  is  a  familiar  method  of 
conveying  information.  This  thesis  will  use  the 
relational  data  model  exclusively. 

Relational Databases

The  previous  discussion  presented  some  general 
concepts  of  data  models,  but  to  fully  understand 
relational  databases,  the  accepted  conventions, 
properties,  and  formal  definitions  of  a  relational 
database  must  be  explained.  Henceforth,  the  use  of  the 
word "database" will refer to a relational database.

In a database, the two-dimensional table is called
a relation. The columns of the relation are labeled
with unique names called attributes, and the rows are
called  tuples.  The  data  values  in  the  relation  are 
chosen  from  several  sets  of  values  called  domains. 
Each  attribute  has  a  domain  and  several  attributes  may 
share  the  same  domain.  For  example,  if  a  relation  has 
two  attributes,  say  part  number  and  serial  number,  the 
attributes  are  different,  but  their  domains  could  be 
the  same  set  of  numbers.  A  more  formal  definition  of 
a  relation  is  now  given,  since  some  of  the  basic 
terminology  has  been  discussed. 

Definition. Given a set of domains D1, D2, ..., and
Dn, R is a relation on these n sets if it is a
collection of n-tuples, <d1, d2, ..., dn>, such that d1
is an element of D1, ..., and dn is an element of Dn.

The  usual  method  of  representing  attributes  in  a 
relation  is  to  allow  letters  near  the  beginning  of  the 
alphabet  to  stand  for  individual  attributes,  and 
letters  near  the  end  of  the  alphabet  to  stand  for 
sets  of  attributes.  For  example,  C  could  represent 
the  attribute  COURSE  in  Fig.  1,  and  X  could  represent 
the set of attributes {NAME, COURSE, TIME, ROOM
NUMBER}. The union of two sets of attributes, X and
Y, is denoted by the concatenation XY, and ABC
represents the set of attributes {A,B,C}. The relation
R on the set of attributes X in Fig. 1 is written as
R(X). If X is broken into two sets, Y = {COURSE, TIME}
and Z = {NAME, ROOM NUMBER} where X = YZ, then R(X) is
the same as R(Y,Z).


   +---------+-------------+-------+-------------+
   | NAME    | COURSE      | TIME  | ROOM NUMBER |
   +---------+-------------+-------+-------------+
   | Green   | Psychology  | 8:00  | 112         |
   | Green   | Psychology  | 10:00 | 112         |
   | Stewart | Chemistry   | 2:00  | 106         |
   | Stewart | Chemistry   | 8:00  | 104         |
   | Jones   | Mathematics | 12:00 | 210         |
   | Smith   | Psychology  | 9:00  | 104         |
   | Johnson | Physics     | 9:00  | 210         |
   +---------+-------------+-------+-------------+

             Fig. 1. Relation R(X).

Database dependencies. In a database, several
relationships  exist  among  the  attributes.  One  of  the 
main  relationships  is  that  of  functional  dependency 
(FD).  Before  dependencies  can  be  discussed,  the 
representation  of  a  data  value  in  a  tuple  must  be 
explained. Let r be a tuple in the relation R(X) on
the set of attributes X, where the set of attributes Y
is contained in X. The tuple of values of r for the
set of attributes Y is denoted by r[Y].

Definition. Given a relation R and two sets of
attributes X and Y, the functional dependency X -> Y
holds in R (or relation R satisfies X -> Y) if and
only if for any two tuples v and w in R, v[X] = w[X]
implies v[Y] = w[Y].
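
The definition can be checked mechanically on a small table. The following Python fragment is a minimal sketch, assuming the relation of Fig. 1 is held as a list of tuples; the names ATTRS, R, project, and fd_holds are hypothetical helpers introduced only for this illustration.

```python
# Relation R(X) of Fig. 1; column order follows ATTRS.
ATTRS = ("NAME", "COURSE", "TIME", "ROOM")
R = [
    ("Green", "Psychology", "8:00", "112"),
    ("Green", "Psychology", "10:00", "112"),
    ("Stewart", "Chemistry", "2:00", "106"),
    ("Stewart", "Chemistry", "8:00", "104"),
    ("Jones", "Mathematics", "12:00", "210"),
    ("Smith", "Psychology", "9:00", "104"),
    ("Johnson", "Physics", "9:00", "210"),
]


def project(row, attrs):
    """Return r[Y]: the values of one tuple for the attributes in attrs."""
    return tuple(row[ATTRS.index(a)] for a in attrs)


def fd_holds(relation, lhs, rhs):
    """True if lhs -> rhs holds: equal lhs values force equal rhs values."""
    seen = {}
    for row in relation:
        x, y = project(row, lhs), project(row, rhs)
        # setdefault stores y the first time x is seen, then returns the stored value
        if seen.setdefault(x, y) != y:
            return False
    return True


print(fd_holds(R, ["NAME"], ["COURSE"]))
print(fd_holds(R, ["COURSE"], ["NAME"]))
```

For this table the first call reports that NAME -> COURSE holds, while the second fails because Psychology appears with both Green and Smith.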

Definition. A dependency s is a consequence of a
set  of  dependencies  S  if  for  all  relations  R,  s  holds 
in  R  if  all  the  dependencies  of  S  hold  in  R. 

Functional  dependencies  are  used  extensively  in 
designing  relations  that  are  free  from  data  storage 
and retrieval errors. These errors are called
insertion, deletion, and rewriting anomalies. The
insertion anomaly is the use of undefined or null
values in the table of a relation. The removal of a
part of a tuple, causing the loss of other
information, is called a deletion anomaly. The
rewriting anomaly can easily be explained by the
following example. Suppose the functional dependency
A->B holds in the relation R(X), and there exist
tuples t1 = <a1,b1,c1> and t2 = <a1,b1,c2> in R(X).
Then if t1 is changed to <a1,b2,c1>, the tuple t2 must
be rewritten as <a1,b2,c2>. If t2 is not changed, an
anomaly  will  exist  in  the  relation  since  the 
dependency  A->B  will  no  longer  hold. 

Database schemata and keys. A relation schema is a
description of a single relation consisting of the
relation name, a set of attributes, and a set of
dependencies. The state (instance or extension) of a
relation schema is simply a table of data that
conforms to the set of dependencies and uses the
attributes contained in the relation schema. A
database schema is the set of relation schemata in
the database. The state of a database, D, is a mapping
of relation states to the schemata of the database
schema.

The  concept  of  a  set  of  key  attributes  (or  simply 
a  key)  existing  in  a  database  is  vital  to  the 
retrieval  of  information  stored  in  a  database.  Once  a 
key  has  been  located,  any  other  information  stored  in 
the  database  can  be  accessed. 

Definition. A subset Y of X is a key for R(X) if
and only if Y->X and there is no proper subset Z of Y
such that Z->X.

In  other  words,  a  key  of  a  relational  database  is 
a  subset  of  its  attributes  that  "unlocks"  the 
information  stored  in  the  database:  if  the  data  values 
for  a  key  are  specified,  a  unique  row  of  the  table  is 
identified. 
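
For a small table the keys of a given state can be found by exhaustive search. The sketch below assumes the relation of Fig. 1 and tests subsets in order of increasing size; it is a brute-force check, not the key-generation method developed later, and the names is_superkey and keys are hypothetical.

```python
from itertools import combinations

ATTRS = ("NAME", "COURSE", "TIME", "ROOM")
R = [
    ("Green", "Psychology", "8:00", "112"),
    ("Green", "Psychology", "10:00", "112"),
    ("Stewart", "Chemistry", "2:00", "106"),
    ("Stewart", "Chemistry", "8:00", "104"),
    ("Jones", "Mathematics", "12:00", "210"),
    ("Smith", "Psychology", "9:00", "104"),
    ("Johnson", "Physics", "9:00", "210"),
]


def is_superkey(relation, attrs, subset):
    """True if the values on subset identify a unique row of the table."""
    idx = [attrs.index(a) for a in subset]
    projected = {tuple(row[i] for i in idx) for row in relation}
    return len(projected) == len(set(relation))


def keys(relation, attrs):
    """Return the minimal attribute sets that identify every row."""
    found = []
    for size in range(1, len(attrs) + 1):
        for subset in combinations(attrs, size):
            # Skip any set that already contains a key: it is only a superkey.
            if any(set(k) <= set(subset) for k in found):
                continue
            if is_superkey(relation, attrs, subset):
                found.append(subset)
    return found


print(keys(R, ATTRS))
```

The search cost grows exponentially with the number of attributes, which is one reason a method based on the functional dependencies is attractive.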

The notion of a superkey is closely related to the
notion of key. A superkey is a set of attributes
containing a key as a subset. Consider the relation
R(X) shown in Fig. 2, on the set of attributes X =
{A,B,C,D}. The set Z = {A,D} is a key of R(X); thus
one of the superkeys of R(X) is the set Y = {A,B,D}.


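   +----+-------------+----+----+
   | A  | B           | C  | D  |
   +----+-------------+----+----+
   | a1 | b1          | c1 | d1 |
   | a2 | b2          | c2 | d1 |
   | a1 | [illegible] | c1 | d2 |
   | a2 | b3          | c1 | d2 |
   +----+-------------+----+----+

        Fig. 2. Relation R(X).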

An  important  but  difficult  task  to  be  completed 
before  a  database  can  be  used  is  that  of  determining 
the  set  of  keys  for  a  given  relation.  To  solve  this 
problem,  the  set  of  functional  dependencies  must 
either  be  known  or  found  from  the  relation.  A 
procedure  to  generate  these  dependencies  and  the  keys 
for  a  relation  is  presented  later  in  this  thesis. 


CHAPTER  III 


PROPOSITIONAL  LOGIC  AND  THE  EQUIVALENCE  THEOREM 

As  discussed  in  Chapter  II,  the  determination  of  a 
set  of  keys  for  a  relation  in  a  database  can  be  a 
difficult  task.  However,  once  a  key  has  been  located, 
the  data  stored  in  the  database  can  be  easily 
accessed.  It  would  be  very  desirable,  therefore,  to 
have  a  method  of  key  generation  for  a  relation.  The 
aim  of  the  following  discussion  is  to  present  a  method 
to  locate  the  keys  of  a  relation  in  a  database  using 
the  functional  dependencies  of  the  relation.  In  later 
chapters,  this  algorithm  will  be  used  as  a  major  part 
of  the  FD-Key  generation  algorithm.  The  method  of 
locating  the  keys  will  be  developed  by  examining  the 
equivalence  between  propositional  logic  and  database 
dependencies.  Before  this  equivalence  can  be 
discussed,  some  basic  ideas  of  propositional  logic  and 
Boolean  analysis  will  be  presented. 

Propositional Logic

Propositional  logic  deals  with  statements  that  are 
assigned  a  truth  value.  Each  statement  is  called  a 
proposition, and it can have only one truth value,
either true or false.
These statements are denoted by propositional
variables A, B, C, .... Using the logic operations &
(and) and => (imply), an implication A1&A2&...&An =>
B1&...&Bk can be created. This implication is said to
be true if and only if all of the Bj's are true or at
least one of the Ai's is false. Hence, this
implication can be viewed as a statement
(proposition). Normally, & is represented by simple
juxtaposition of the variables. For example, the
above implication may also be written as
A1A2...An => B1...Bk. It should be noted that in this
thesis the symbol (=>) is used for conditional
implication. Normally, this symbol is used for
logical implication, and the symbol (->) is used for
conditional implication. However, the symbol (->) is
reserved in this thesis for use with functional
dependencies; to avoid notational confusion,
therefore, the symbol (=>) is used for conditional
implication. The following discussion presents some
basic ideas of propositional logic [8].

A  fundamental  inference-rule  of  propositional 
logic  is  that  of  hypothetical  syllogism.  This  rule 
states  that  the  conclusion  below  follows  from  its 
premises. 

     Major Premise:  X=>Y.
     Minor Premise:  Y=>Z.
     Conclusion:     X=>Z.

The proposition X=>Z is said to be a logical
consequence of the set of propositions {X=>Y, Y=>Z}.
In general, we have the following:

Definition. The proposition F is a logical
consequence of a set of propositions S, if for every
truth assignment P, the proposition F is true under P
when all the propositions of S are true under P.
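
Because a set of propositions over n variables has only 2^n truth assignments, the definition can be tested directly for small cases. The sketch below enumerates every assignment for the hypothetical syllogism; the function names are assumptions made for this illustration.

```python
from itertools import product


def implies(a, b):
    """Conditional implication: false only when a is true and b is false."""
    return (not a) or b


def is_logical_consequence(premises, conclusion, n_vars):
    """True if the conclusion holds under every assignment satisfying all premises."""
    for values in product([False, True], repeat=n_vars):
        if all(p(*values) for p in premises) and not conclusion(*values):
            return False
    return True


# Hypothetical syllogism: {X=>Y, Y=>Z} has X=>Z as a logical consequence.
premises = [lambda x, y, z: implies(x, y), lambda x, y, z: implies(y, z)]
conclusion = lambda x, y, z: implies(x, z)

print(is_logical_consequence(premises, conclusion, 3))
```

The converse does not hold: X=>Y is not a logical consequence of X=>Z alone, as the assignment X true, Y false, Z true shows.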

In propositional logic, deduction (the generation
of a conclusion from a set of premises) is performed
by invoking various inference rules. These rules
state that a specific conclusion can be obtained from
a specific set of premises. While these rules work
and are useful, a simpler method of deriving
conclusions would be valuable.

Boolean  Analysis 

Propositions  satisfy  a  set  of  mathematical  laws 
that  are  used  to  define  a  Boolean  algebra.  The 
relation => (conditional implication) of propositions
can be translated into the relation ≤ (inclusion) of
Boolean algebra. In particular, the statement

     If X is true, then Y is true

can be represented by the two equivalent expressions

     X=>Y, and
     X ≤ Y.

The  information  in  these  expressions  can  also  be 
presented  in  two  types  of  equations.  These  equations 
can  either  be  in  the  "equals-zero"  or  "equals-one" 
form  of  Boolean  algebra.  The  equals-one  form  is  found 
by  complementing  the  left  side  of  the  arrow  and 
forming  the  logic  OR  of  this  result  with  the  right 
side. For example, the equals-one form of the
proposition X=>Y is given by X' + Y = 1. The
equals-zero form is found by complementing the right
side of the arrow and forming the logic AND of this
result with the left side of the arrow. For the
previous example, the equals-zero form would be
XY' = 0. The equals-one form states that "X is false
or Y is true" is a true statement. The equals-zero
form states that "X is true and Y is false" is a false
statement. Hence, the propositions X=>Y, Y=>Z, and
X=>Z can be represented as Boolean equations XY' = 0,
YZ' = 0, and XZ' = 0, respectively. It is a property
of Boolean algebra that a sum is equal to zero if and
only if each of its summands is equal to zero; hence,
the above equations can be written as one equation,
i.e., XY' + YZ' + XZ' = 0. Each of the above summands
is made up of variables. A single variable, either
complemented or uncomplemented, will be called a
literal, and the summands in the above equation will
be called terms. Each term consists of a single
literal or a product of literals in which no literal
appears more than once. A term p is included in a
term q if all of the literals of q are contained in p.
An SOP (sum of products) formula is a single term or a
sum of terms. Two important types of terms will now
be defined.

Definition. An implicant of a function F is a
term p such that p is included in F.

Definition. A prime implicant of a function F is
an implicant p of F such that, for any term q, if p is
included in q and q is included in F then p and q are
equal.

Blake canonical form. In 1937, A. Blake [6]
showed that the sum of all prime implicants of a
Boolean function G is a canonical form for that
function. We shall call this the Blake canonical form
for G and denote it by BCF(G).

There  are  several  methods  of  generating  the  Blake 
canonical  form  of  a  Boolean  function.  This  thesis 
will only deal, however, with the method of iterated
consensus, which is based upon the following
definitions.

Definition. Two terms p and q are said to have a
literal in opposition if

(i) term p contains a variable A that is
uncomplemented, and

(ii) term q contains the complemented variable A'.

Definition. Let two terms T1 and T2 of a Boolean
formula F have exactly one literal in opposition,
i.e., let T1 = X'P and T2 = XQ, where P and Q are
terms such that PQ is not equal to zero. Then the
consensus of T1 and T2 is formed from the product T1T2
by

(i) deleting the two opposing literals and
(ii) deleting any repetitions of a literal.

The  method  of  generating  the  BCF  of  a  Boolean  function 
using  iterated  consensus  is  given  below. 

Definition. Given a Boolean formula F, BCF(F) can
be generated by the following procedure.

(i) Express F as an SOP formula.

(ii) Persist in the following operations as
long as possible:

     (a) Append to the formula the consensus
         of two of its terms, unless the
         consensus is included in a term
         already present.

     (b) Delete any term that is included
         in another term.
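
The procedure above can be sketched in a few lines of Python. This is a minimal illustration under stated assumptions, not the PROLOG software of Appendix B: a term is held as a frozenset of literal strings such as "X" and "Y'", variables are single capital letters, and the names term, negate, consensus, absorb, and bcf are hypothetical.

```python
import re


def term(s):
    """Parse a product such as "XY'" into a frozenset of literals {"X", "Y'"}."""
    return frozenset(re.findall(r"[A-Z]'?", s))


def negate(lit):
    """Return the opposing literal: X <-> X'."""
    return lit[:-1] if lit.endswith("'") else lit + "'"


def consensus(p, q):
    """Consensus of two terms, or None unless exactly one literal is in opposition."""
    opposed = [lit for lit in p if negate(lit) in q]
    if len(opposed) != 1:
        return None
    lit = opposed[0]
    # Drop the opposing pair; the set union removes repeated literals.
    return (p - {lit}) | (q - {negate(lit)})


def absorb(terms):
    """Delete every term included in another (i.e. having extra literals)."""
    return {t for t in terms if not any(u < t for u in terms)}


def bcf(terms):
    """Blake canonical form of an SOP formula by iterated consensus."""
    terms = absorb(set(terms))
    changed = True
    while changed:
        changed = False
        for p in list(terms):
            for q in list(terms):
                c = consensus(p, q)
                # Append the consensus unless it is included in a term already present.
                if c is not None and not any(u <= c for u in terms):
                    terms = absorb(terms | {c})
                    changed = True
    return terms


print(sorted("".join(sorted(t)) for t in bcf({term("XY'"), term("YZ'")})))
```

With this representation, "p is included in q" becomes the subset test q <= p, so absorption and the inclusion check in step (a) each reduce to one set comparison.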

Definition. An SOP formula G is said to be
formally  included  in  an  SOP  formula  F  if  every  term  of 
G  is  included  in  some  term  of  F. 

The  following  two  theorems  will  be  presented 
without  proofs.  For  a  more  formal  presentation,  see 
Blake  [6]. 

Theorem. An equation F = 0 is a conclusion of the
equation G = 0 if and only if the function F is included
in the function G.

Theorem. Let F and G be SOP formulas. Then F is
included in G if and only if F is formally included in
BCF(G).

To clarify this Theorem, let us examine the
following expressions (hypothetical syllogism):

                      Propositions     Equations

     Major Premise:      X=>Y          XY' = 0
     Minor Premise:      Y=>Z          YZ' = 0
     Conclusion:         X=>Z          XZ' = 0

After forming the SOP formula G = XY' + YZ',
BCF(G) can be found by iterated consensus: BCF(G) =
XY' + YZ' + XZ'. But XZ' is formally included in
BCF(G), so XZ' = 0 is a conclusion of the equation G =
0. This conclusion is equivalent to the proposition
X=>Z, and hence the same result is found by two
different but equivalent methods. As mentioned
earlier, a simplified method for inferring conclusions
was desired. Using the Blake canonical form to
generate a conclusion from a given set of
propositional premises, stated as equations, is such a
method. For a more detailed study of this procedure,
see [7].

Equivalence Between Propositional Logic and Databases

For a given set of propositions {A=>B, C=>D} and a
corresponding set of functional dependencies
{A->B, C->D}, the syntactical similarity of the sets is
very apparent. However, this similarity does not
necessarily imply that two corresponding elements of
these sets are equivalent. Fortunately, Sagiv et al.
[21] have proved the following theorem. This
theorem states that a set S of functional dependencies
is equivalent to a corresponding set of propositions
S*, where S* is obtained by replacing the dependency
symbol (->) with the conditional implication symbol
(=>).

Equivalence Theorem. Let F be a functional
dependency and let S be a set of dependencies. Then
the following are equivalent:

(i) The functional dependency F is a
consequence of the set S of
functional dependencies.

(ii) The proposition F* is a logical
consequence of the set S* of
propositions.

This theorem states that the set S = {A->B, B->D}
of functional dependencies has an equivalent set S* =
{A=>B, B=>D} of propositions, which is generated by
replacing the symbol (->) with (=>). Further, since
the proposition A=>D is a logical consequence of S*,
the equivalent functional dependency A->D is a
consequence of S.

This  theorem  is  a  very  bold  statement.  It  allows 
any  database  problem  concerning  functional 
dependencies  to  be  solved  by  the  techniques  of 
propositional  logic  and  guarantees  the  solution  to 
hold  for  the  dependencies  of  the  database.  Since  the 
available  tools  of  propositional  logic  are  generally 
much  easier  to  implement  than  the  inference  rules  for 
dependencies, a very difficult database problem may
easily be solved with propositional logic. Hence, the
preceding  method  of  iterated  consensus  may  be  used  to 
generate  the  solutions  for  a  given  problem  concerning 
the  dependencies  of  a  database. 

Key  generation.  As  an  example  of  the  power  of 
this  theorem  let  us  examine  the  relation  R(X)  in  Fig. 
1  of  Chapter  II.  The  following  functional 
dependencies  exist  in  this  relation.  Note  that  the 
attributes  are  replaced  by  one-letter  symbols  to  make 
the  variable  manipulations  clearer. 


     NAME -> COURSE                  N -> C
     NAME, TIME -> ROOM, COURSE      NT -> RC
     NAME, ROOM -> COURSE            NR -> C
     COURSE, TIME -> NAME, ROOM      CT -> NR
     COURSE, ROOM -> NAME            CR -> N
     TIME, ROOM -> COURSE, NAME      TR -> CN

After  writing  the  preceding  six  dependencies  in 
their  equivalent  propositional  logic  forms,  the 
following  equations  are  generated  by  complementing  the 
right  sides  of  the  equivalent  propositions  and  forming 
the  logic  AND  of  this  result  with  the  left  sides  of 
the  propositions. 


     N=>C          C'N = 0
     NT=>RC        R'NT + C'NT = 0
     NR=>C         C'NR = 0
     CT=>NR        CN'T + CR'T = 0
     CR=>N         CN'R = 0
     RT=>CN        C'RT + N'RT = 0

Since these equations are in equals-zero form,
they are equivalent to the single equation G = 0,
where the function G is the logical sum of their left
members, i.e.,

     G = C'N + R'NT + C'NT + C'NR + CN'T + CR'T + CN'R
         + C'RT + N'RT.
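
The translation from dependencies to equals-zero terms is mechanical: a dependency X -> Y1...Yk contributes one term X(Yi)' for each attribute Yi on the right side. The sketch below assumes single-letter attribute symbols and holds each term as a frozenset of literals; fd_to_terms and FDS are hypothetical names.

```python
def fd_to_terms(lhs, rhs):
    """Equals-zero terms of lhs -> rhs: the left side ANDed with each complemented right-side symbol."""
    terms = set()
    for y in rhs:
        if y in lhs:
            continue  # X -> y is trivial when y is in X; the product X y' is identically zero
        terms.add(frozenset(lhs) | {y + "'"})
    return terms


# The six dependencies of the relation of Fig. 1, written with one-letter symbols.
FDS = [("N", "C"), ("NT", "RC"), ("NR", "C"), ("CT", "NR"), ("CR", "N"), ("TR", "CN")]

# G is the logical sum (here: the set union) of all of the terms.
G = set().union(*(fd_to_terms(x, y) for x, y in FDS))

for t in sorted("".join(sorted(t)) for t in G):
    print(t)
```

The nine printed terms correspond to the nine summands of G, although the letters within a term may appear in a different order.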

To  generate  the  keys  associated  with  a  relation,  a 
method  based  upon  the  one  developed  by  Delobel  and 
Casey  [13]  will  be  used.  For  a  given  relation,  the 
minterm  M,  which  is  the  juxtaposition  of  all  of  the 
attribute  symbols,  is  always  a  superkey  of  the 
relation.  If  K  is  the  juxtaposition  of  all  the 
attributes of a key of the relation and if G = 0 is
the equation representing the set of dependencies of
the relation, then the implication

     [G = 0] => [K = M]

defines all of the keys of the relation.

For the above implication to be true, either G = 0
must be false, i.e., G = 1, or K must be equal to M,
i.e., K and M must both be false, or K and M must
both be true. Hence, the above implication can be
expressed as the equivalent equation

     G + KM + K'M' = 1.

By  applying  some  generally  known  properties  of 
propositional  logic  and  Boolean  algebra  to  the  above 
equation,  the  following  equivalent  forms  can  be 
derived. 

     K(M + G) + K'(M' + G) = 1
     K(M'G') + K'(MG') = 0
     G'M ≤ K ≤ (G'M')'
     G'M ≤ K ≤ G + M
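
The intermediate steps can be written out as follows. This is a reconstruction of the reasoning from standard Boolean identities (expansion about K, complementation, and the fact that a sum is zero only if each summand is zero), not a quotation from the source.

```latex
\begin{align*}
G + KM + K'M' &= 1 \\
K(G + M) + K'(G + M') &= 1
  && \text{since } G = KG + K'G \\
K(G + M)' + K'(G + M')' &= 0
  && \text{complementing: } (Ka + K'b)' = Ka' + K'b' \\
K\,G'M' + K'\,G'M &= 0
  && \text{De Morgan} \\
K\,G'M' = 0 \quad\text{and}\quad K'\,G'M &= 0
  && \text{each summand must vanish} \\
K \le (G'M')' = G + M \quad\text{and}\quad G'M &\le K
  && \text{since } ab' = 0 \iff a \le b
\end{align*}
```

Combining the last two inclusions gives the double bound on K stated above.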

Let  us  examine  the  formula  G.  Since  G  represents 
the  original  dependencies  of  the  relation,  each  term 
of  G  will  contain  at  least  one  complemented  attribute 
symbol.  Hence,  G  may  not  include  a  minterm 
containing  only  uncomplemented  attribute  symbols.  The 
minterm  M  containing  all  of  the  attribute  symbols  in 
uncomplemented  form  is  therefore  not  included  in  G. 
Thus M is included in G', i.e.,

     M ≤ G'.

The foregoing inclusion is equivalent to the Boolean
equation

     M = G'M;

therefore, the expression

     G'M ≤ K ≤ G + M

is equivalent to

     M ≤ K ≤ G + M,

which is equivalent in turn to

     M ≤ K ≤ BCF(G + M).

The  reason  that  BCF(G  +  M)  is  used  is  that  it  includes 
all  of  the  information  available  in  terms  that  contain 
the  fewest  possible  attribute  symbols. 

To determine the keys for a relation, only the
terms of BCF(G + M) that contain no complemented
attribute symbols are considered. This can be
explained by re-examining the bounds on K. The
minterm M is the product of all of the attribute
symbols, and it forms the lower bound on K. So M must
be included in K, and hence K can only include
uncomplemented attribute symbols. But K must be
included in BCF(G + M); therefore the terms of
BCF(G + M) that are keys must contain only
uncomplemented attribute symbols.

From the previous example for Relation R(X) of
Fig. 1, where C = COURSE, N = NAME, R = ROOM NUMBER,
and T = TIME, the term M is found to be M = CNRT.
Using the set {N->C, NT->RC, NR->C, CT->NR, CR->N,
TR->CN} of functional dependencies for this relation,
together with the equivalence theorem, the set {N=>C,
NT=>RC, NR=>C, CT=>NR, CR=>N, TR=>CN} of equivalent
propositions is generated. The formula

     G = C'N + R'NT + C'NT + C'NR + CN'T + CR'T + CN'R
         + C'RT + N'RT

is produced by converting each proposition into an
equation of equals-zero form, forming the sum G = 0 of
all of these equations, and writing the formula G.
Adding the term M to G and calculating BCF(G + M)
yields the result

     BCF(G + M) = CT + RT + NT + C'N + CN'R.

Using the expression

     M ≤ K ≤ BCF(G + M)

for the bounds on the unknown key K, the relation

     CNRT ≤ K ≤ CT + RT + NT + C'N + CN'R

is generated. Now by examining the terms of BCF(G + M)
containing no complemented variables, the keys CT, RT,
and NT for the relation R(X) of Fig. 1 are found.
Also, the superkeys of a relation can be found by
concatenating any number of uncomplemented attribute
symbols in the relation to the symbols of a key.
Therefore, CNT, CRT, NRT, and CNRT are superkeys of
the relation.
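
The result of the example can be cross-checked by a truth table over the four attribute symbols. The sketch below does not perform iterated consensus; it only confirms that the five-term formula has the same value as G + M under all sixteen assignments, that each of its terms is a prime implicant, and that the terms without complemented symbols are CT, RT, and NT. All function names are hypothetical.

```python
import re
from itertools import product

VARS = "CNRT"
G = ["C'N", "R'NT", "C'NT", "C'NR", "CN'T", "CR'T", "CN'R", "C'RT", "N'RT"]
M = "CNRT"
BCF = ["CT", "RT", "NT", "C'N", "CN'R"]


def literals(term):
    """Split a product such as "CN'R" into its literals."""
    return re.findall(r"[A-Z]'?", term)


def term_value(term, env):
    """A product is true when every literal is true (X' is true when X is false)."""
    return all(env[lit[0]] != lit.endswith("'") for lit in literals(term))


def sop_value(terms, env):
    """A sum of products is true when at least one product is true."""
    return any(term_value(t, env) for t in terms)


def envs():
    """Yield all 2**4 truth assignments to C, N, R, T."""
    for bits in product([False, True], repeat=len(VARS)):
        yield dict(zip(VARS, bits))


def equivalent(f, g):
    return all(sop_value(f, e) == sop_value(g, e) for e in envs())


def is_implicant(term, f):
    return all(sop_value(f, e) for e in envs() if term_value(term, e))


def is_prime(term, f):
    """An implicant is prime if deleting any one literal destroys the inclusion."""
    lits = literals(term)
    shorter = ["".join(lits[:i] + lits[i + 1:]) for i in range(len(lits))]
    return is_implicant(term, f) and not any(is_implicant(s, f) for s in shorter)


keys = [t for t in BCF if "'" not in t]
print(equivalent(G + [M], BCF), all(is_prime(t, G + [M]) for t in BCF), keys)
```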

If the set of functional dependencies for a
relation is known, the above procedure will generate
all keys and superkeys that exist in the relation. If
an algorithm existed to generate the functional
dependencies of a relation, then the generation of
keys for a relation could be implemented on a computer
or dedicated processor designed to perform

CHAPTER  IV 


GENERATION  OF  FUNCTIONAL  DEPENDENCIES 

For  the  key  generation  algorithm  of  Chapter  III  to 
be  applied,  the  functional  dependencies  of  a  relation 
must  be  known.  We  present  in  this  chapter  a  procedure 
for  generating  the  functional  dependencies  of  a 
relation  directly  from  the  rows  (tuples)  defining  that 
relation.  When  combined  with  the  key-generation 
algorithm,  this  procedure  enables  the  keys  of  a 
relational  database  to  be  derived  quickly  and 
conveniently. We call the combined procedure the
FD-Key Algorithm.

The  FD-Key  algorithm  may  be  used  to  solve  a  number 
of  problems.  Suppose  a  programmer  were  assigned  the 
task  of  setting  up  a  large  database;  then  the 
generation  of  the  keys  could  be  a  very  tedious  and 
time  consuming  task.  By  using  the  FD-Key  algorithm, 
the  programmer  could  simply  insert  the  database  into 
the  computer  system,  execute  the  algorithm,  and 
receive  the  keys  and  functional  dependencies  of  the 
database  as  outputs. 

As  another  example,  suppose  a  database  could  be 
updated by several different users. That is, several
users could be changing data values of tuples in
the  database.  This  process  might  create  anomalies 
in  the  database.  Hence,  a  functional  dependency  for 
the  original  database  might  no  longer  hold  in  the 
updated  version  of  the  database.  This  type  of  error 
may  be  detected  in  the  following  manner.  After  each 
change  of  data,  the  FD-Key  algorithm  could  be  executed 
on  the  new  database.  If  the  functional  dependencies 
generated  from  the  new  database  were  different  from 
the  functional  dependencies  of  the  original  database, 
then  the  recent  data  changes  had  violated  the 
integrity  of  the  database.  Therefore,  the  data-updates 
should  be  examined  for  an  error. 

This  chapter  will  present  the  section  of  the  FD- 
Key  algorithm  that  generates  the  functional 
dependencies  of  a  relation.  To  fully  understand  the 
operation  of  this  section  of  the  algorithm,  the 
flowcharts  of  Appendix  A  and  the  examples  contained  in 
this  chapter  should  be  closely  examined. 

The  algorithm  makes  extensive  use  of  partitions, 
whose  definition  we  now  recall  [16]. 

Definition. A partition P of a non-empty, finite
set S is a collection of non-empty subsets of S. The
partition is denoted by P = {B1, B2, ..., Bk}; the
subsets B1, ..., Bk are called the blocks of the
partition P. The blocks of a partition must satisfy
the following two conditions.

(i) The intersection of any two blocks, Bi and
Bj for i not equal to j, is the empty set.
(ii) The union of all the Bi's is the set S.
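
The two conditions translate directly into a small test. The following sketch assumes the blocks are given as Python collections; is_partition is a hypothetical name.

```python
def is_partition(blocks, s):
    """True if blocks are non-empty, pairwise disjoint, and together cover s."""
    blocks = [set(b) for b in blocks]
    if any(not b for b in blocks):
        return False  # every block must be non-empty
    union = set().union(*blocks)
    # If the block sizes add up to the size of the union, no element is shared.
    disjoint = sum(len(b) for b in blocks) == len(union)
    return disjoint and union == set(s)


print(is_partition([{1, 2}, {3}], {1, 2, 3}))
print(is_partition([{1, 2}, {2, 3}], {1, 2, 3}))
```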

generation  Algorithm 

The  problem  to  be  solved  can  be  stated  as  follows: 
given  a  relation  R  and  a  specific  attribute  A 
contained  in  the  relation,  generate  the  functional 
dependencies  of  R  that  contain  A  on  the  right  side  of 
the  arrow.  The  desired  dependencies  will  have  the 
form X->A, where X may be the concatenation of several
attribute  symbols.  By  continuing  this  process  for  all 
of  the  attributes,  the  set  of  functional  dependencies 
for  the  given  relation  will  be  generated. 

The algorithm to generate the functional
dependencies manipulates the data in the relations of
the database in three different ways. The first data
manipulation involves partitioning. Specifically, the
tuples of the original relation are placed into a
number of relations containing two sub-relations, each
of which is generated by using a two-block partition
of the data values for an attribute in the original
relation. After formation of the sub-relations, the
second type of data manipulation is performed. Here,
the tuples in each pair of sub-relations are compared
and a Boolean sum of attribute symbols is generated.
This sum actually represents the key for the two
tuples in the sub-relations being examined. By
repetitive generation of these sums for each pair of
tuples, the keys of the pair of sub-relations can be
found. After all of these sums are generated for the
pair of sub-relations, a product of sums (POS) formula
is generated by forming a Boolean product of all of
the sums. Each pair of sub-relations will be
subjected to this procedure, and a group of POS
formulas will be generated. Each of these formulas
will either be zero (indicating that no key exists
for the pair of sub-relations) or a product of sums.
If a formula is zero, this means that no functional
dependency can be found for the chosen attribute of
the original relation. If all of the formulas are
products of sums, however, a third type of data
manipulation is needed to generate the functional
dependencies of the relation.

This  final  data  manipulation  involves  some 
techniques  of  Boolean  algebra.  All  of  the  POS 
formulas generated from the pairs of sub-relations are
multiplied together to form one large POS formula.
This  last  POS  formula  is  converted  to  a  sum  of 
products  (SOP)  formula  and  simplified  as  much  as 
possible.  The  SOP  formula  now  contains  all  of  the 
information  needed  to  produce  the  set  of  functional 
dependencies  for  the  original  relation.  Each  term  of 
the  SOP  formula  contains  the  attribute  symbols  that 
are  on  the  left  sides  of  the  arrows  in  the  functional 
dependencies  that  have  the  chosen  attribute  from  the 
original  relation  on  the  right  side  of  the  arrow. 
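
The effect of the second and third manipulations can be illustrated with a simplified sketch. It is an assumption-laden shortcut, not the FD-Key algorithm itself: instead of building sub-relations from two-block partitions, it compares every pair of tuples that differ on the chosen attribute, forms the sum of the other attributes on which the pair differs, and multiplies the sums out into a simplified SOP formula. The name lhs_for and the data layout are hypothetical.

```python
from itertools import combinations

ATTRS = ("NAME", "COURSE", "TIME", "ROOM")
R = [
    ("Green", "Psychology", "8:00", "112"),
    ("Green", "Psychology", "10:00", "112"),
    ("Stewart", "Chemistry", "2:00", "106"),
    ("Stewart", "Chemistry", "8:00", "104"),
    ("Jones", "Mathematics", "12:00", "210"),
    ("Smith", "Psychology", "9:00", "104"),
    ("Johnson", "Physics", "9:00", "210"),
]


def lhs_for(relation, attrs, target):
    """Left sides X of the dependencies X -> target, as a set of attribute sets."""
    t = attrs.index(target)
    others = [i for i in range(len(attrs)) if i != t]
    # One Boolean sum per pair of tuples that disagree on the target attribute.
    sums = set()
    for v, w in combinations(relation, 2):
        if v[t] != w[t]:
            sums.add(frozenset(attrs[i] for i in others if v[i] != w[i]))
    # Multiply the sums together (POS -> SOP), absorbing redundant products.
    terms = {frozenset()}
    for s in sums:
        terms = {term | {a} for term in terms for a in s}  # an empty sum yields zero
        terms = {x for x in terms if not any(y < x for y in terms)}
    return terms


for x in lhs_for(R, ATTRS, "COURSE"):
    print(sorted(x), "-> COURSE")
```

For the relation of Fig. 1 and the attribute COURSE the product reduces to NAME + TIME ROOM, i.e. the dependencies N->C and TR->C.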

In  the  development  of  the  functional  dependencies 
for  a  relation,  the  following  three  assumptions  will 
be  made. 

(1)  The  data-values  of  the  relation  will  not  be 
updated  during  the  development  period. 

(2)  The  relation  will  contain  a  finite  number  of 
tuples. 

(3)  The  relation  has  a  finite  number  of 
attributes. 

The  algorithm  to  generate  the  functional 
dependencies  of  a  relation  will  now  be  presented  in 
seven  steps. 

Step 1. Choose an attribute to appear on the right
side of the functional dependencies.

Step 2. Generate the set, L, consisting of one
entry for every unique data value in the column under
the chosen attribute.

Step 3. Generate a sequence (P_0*, ..., P_n*) of
partitions of the set L in the following manner:

(i)  Let i = 0. P_i* contains all the data values of
the set L in one block.

(ii)  If the number of elements in the largest block
of P_i* is less than or equal to two, stop the
operations. Otherwise do (iii).

(iii)  Let i = i+1, and generate a new P_i* that contains
2^i blocks. The blocks in the new partition
are found by splitting each of the blocks in
the preceding partition into two disjoint
blocks whose cardinalities differ by at most
one. If a block in the preceding partition
contains an odd number of elements, the left
block of the new pair of blocks will contain
one more element than the new right block. Go
to (ii) and repeat.
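
The splitting rule of Step 3 can be sketched in Python as
follows. This is only an illustration, not the report's
PROLOG code: the function name starred_partitions and the
representation of a partition as a list of blocks (each
block a list of data values) are assumptions made for the
sketch.

```python
def starred_partitions(values):
    """Return [P_0*, P_1*, ...]; each partition is a list of blocks.

    Hypothetical helper: the name and the list-of-lists layout are
    assumptions of this sketch, not part of the original program.
    """
    partitions = [[list(values)]]              # P_0*: one block with everything
    # Step 3(ii): stop once the largest block has at most two elements.
    while max(len(block) for block in partitions[-1]) > 2:
        new_blocks = []
        for block in partitions[-1]:
            half = (len(block) + 1) // 2       # left block gets the extra element
            new_blocks.append(block[:half])
            if block[half:]:                   # guard against an empty right block
                new_blocks.append(block[half:])
        partitions.append(new_blocks)
    return partitions
```

For a three-value set such as {b1, b2, b3} the sketch stops
after one split, giving the blocks b1 b2 and b3.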

Step 4. Generate another sequence of partitions
(P_1, P_2, ...), of two blocks each, in the following
manner:

(i)  Let i = 1, and j = 1. Set P_1 = P_1*.

(ii)  P_{i+1} is made up of two blocks such that
the left block contains the left half of
each block of P_j*. The right block of
P_{i+1} contains all the elements in P_j*
not present in the left block of P_{i+1}.
If a block of P_j* contains an odd number
of elements, the extra element is placed
in the left block of P_{i+1}.

(iii)  If P_j* is the last partition, stop the
procedure. Otherwise, let i = i+1 and
j = j+1, and go to (ii).
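
A possible Python rendering of Step 4 is given below. It is a
sketch under assumptions: the name two_block_partitions is
hypothetical, the input is taken to be the list [P_0*, P_1*, ...]
with each partition written as a list of blocks, and the case
in which no P_1* exists (fewer than three data values), which
the text does not treat, simply yields an empty result.

```python
def two_block_partitions(starred):
    """Build [P_1, P_2, ...] from [P_0*, P_1*, ...].

    Each result is a (left_block, right_block) pair of lists.
    Hypothetical helper; names and layout are assumptions.
    """
    if len(starred) < 2:
        return []                              # no P_1*: case not covered by the text
    first = starred[1]
    result = [(list(first[0]), list(first[1]))]    # Step 4(i): P_1 = P_1*
    for blocks in starred[1:]:                 # P_{j+1} is derived from P_j*
        left, right = [], []
        for block in blocks:
            half = (len(block) + 1) // 2       # odd block: extra element goes left
            left.extend(block[:half])
            right.extend(block[half:])
        result.append((left, right))
    return result
```

Starting from P_1* = {b1 b2, b3}, the sketch returns P_1 with
blocks b1 b2 and b3, and P_2 with blocks b1 b3 and b2.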

The previous two steps have been designed to
provide maximum skewing of the partitions. That is,
the number of elements in the left block of each P_i is
as large as possible. This procedure will minimize
the number of data comparisons necessary in the sixth
step of the total algorithm.

Step 5. Generate n copies of the relation being
tested, where n is the number of generated P_i
partitions. Split each copy of the relation into two
sub-relations according to the data values found in
the blocks of the P_i's. Delete the columns
corresponding to the attribute being tested from each
of these copies. Let R_i1 and R_i2 denote the ith pair
of sub-relations, where the subscripts i1 and i2
represent the tuples associated with the data values
contained in blocks one and two, respectively, of
partition P_i. The generation of the sub-relations is
outlined below.

(i)  Let i = 1, j = the numerical position of the
chosen attribute in the list of attribute
symbols, P_i = the ith two-block partition of
the set of data values for the chosen
attribute, and T1 = the first tuple of the
relation.

(ii)  If the jth data item of T1 is in the left block
of P_i, then place T1 in the sub-relation R_i1.
Otherwise, place T1 in the sub-relation R_i2.

(iii)  If T1 is the last tuple of the original
relation, then go to (iv). Otherwise, let T1 =
the next tuple of the relation, and go to (ii).

(iv)  If i = n, stop this procedure. Otherwise, let
i = i+1, let T1 = the first tuple of the relation,
and go to (ii).
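
Step 5 amounts to one pass over the relation per two-block
partition. The Python sketch below illustrates this; the
function name split_relation, the use of a zero-based column
index, and the representation of tuples as Python tuples are
assumptions of the sketch rather than features of the FD-Key
program.

```python
def split_relation(relation, attr_index, partitions):
    """Return [(R_i1, R_i2), ...], one pair per two-block partition.

    relation   : list of tuples (rows)
    attr_index : zero-based column of the chosen attribute (an assumption;
                 the text counts positions from one)
    partitions : list of (left_block, right_block) pairs of data values
    """
    pairs = []
    for left_block, _right_block in partitions:
        r1, r2 = [], []
        for row in relation:
            # Delete the chosen attribute's column from the stored row.
            reduced = row[:attr_index] + row[attr_index + 1:]
            # Rows whose value lies in the left block go to R_i1, all others to R_i2.
            if row[attr_index] in left_block:
                r1.append(reduced)
            else:
                r2.append(reduced)
        pairs.append((r1, r2))
    return pairs
```

The column is removed while the rows are being distributed,
which gives the same sub-relations as copying first and
deleting the column afterwards.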

Step 6. Generate a Boolean formula E_i for each
relation R_i, that is, for each pair of sub-relations
R_i1 and R_i2. The formula generated will be in
product of sums (POS) form, that is, a sum of
literals logically multiplied by other sums of
literals. These formulas are generated by the
following procedure.

(i)  Let i = 1, and obtain both the first tuple t_1
of the sub-relation R_i1 and the first
tuple t_2 of R_i2. If R_i doesn't exist, go
to Step 7.

(ii)  If t_1 = t_2, abort the dependency algorithm
since no functional dependencies exist
with the chosen attribute on the right
side of the arrow. Otherwise go to
(iii).

(iii)  Compare the data values under
corresponding attributes of each tuple.
If any pair of data values are distinct,
insert their attribute name into a sum S.
Continue this procedure until all pairs
of data values in t_1 and t_2 have been
exhausted, then go to step (iv).

(iv)  Insert the sum S as a factor in the POS
formula E_i. Go to step (v).

(v)  If t_2 was the last tuple of R_i2 and t_1
was the last tuple of R_i1, let i = i+1 and
return to step (i) to obtain the first
tuples of the next pair (without resetting
i). If t_2 was the last tuple of R_i2 and
t_1 was not the last tuple of R_i1, replace
t_1 with the next tuple of R_i1 and t_2 with
the first tuple of R_i2, then go to step
(ii). If t_2 was not the last tuple of R_i2,
replace t_2 with the next tuple of R_i2 and
go to step (ii).

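
The pairwise comparison of Step 6 can be sketched in Python as
shown here. The function name pos_formula, the use of a
frozenset of attribute names for each sum, and the return
value None for the abort case are assumptions made for
illustration only.

```python
def pos_formula(r1, r2, attrs):
    """Return the POS formula for one pair of sub-relations.

    r1, r2 : lists of tuples (the chosen attribute's column already removed)
    attrs  : attribute names for the remaining columns, in column order
    Result : list of sums, each a frozenset of attribute names, in the order
             the tuple pairs are visited; None if two tuples are identical
             (Step 6(ii): no dependency exists for the chosen attribute).
    """
    sums = []
    for t1 in r1:                      # outer loop: tuples of R_i1
        for t2 in r2:                  # inner loop: tuples of R_i2
            if t1 == t2:
                return None
            # Step 6(iii): collect the attributes whose values differ.
            sums.append(frozenset(a for a, x, y in zip(attrs, t1, t2) if x != y))
    return sums
```

For the tuples <a1, c1, d1> and <a2, c1, d1> with attributes
A, C, D, the comparison yields the single sum (A).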
It was stated previously that maximum skewing of
the partitions will minimize the number of tuple
comparisons. For example, assume the relation
contains n+k tuples such that the sub-relations R_i1
and R_i2 contain n and k tuples, respectively. If k is
smaller than n, so that k = n-p, where p > 0, then
the number of tuple comparisons required to generate
the formula E_i is nk = n(n-p) = n^2 - np. But if k and n
are equal, then the number of comparisons required is
nn = n^2. And if k = n+1, then the number of comparisons
would be even larger, i.e., nk = n(n+1) = n^2 + n. So it is
evident that maximum skewing of the partitions
P_i is necessary to minimize the number of tuple
comparisons.

Step 7. Generate the functional dependencies for
the relation. These dependencies will contain the
chosen attribute on the right side of the arrow. The
procedure is outlined below.

(i)  Let the function E_A be composed of the
product of all the E_i functions
previously found in the sixth step of the
algorithm.

(ii)  Convert the POS form of E_A to a SOP form
by multiplying out the sums and deleting
any terms that contain all the literals
of another term in E_A.

(iii)  Create a dependency with each term of
E_A and the attribute A, and place each of
these dependencies into a set DEP. These
dependencies will be of the form T_i->A,
and DEP will be equal to {T_1->A, ...,
T_m->A}. The set DEP now contains all of
the dependencies from the original
relation that have the chosen attribute
on the right side of the arrow.
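
The POS-to-SOP conversion in Step 7(ii) can be sketched in
Python as follows. The name pos_to_sop and the set-based
representation (each sum and each product term a frozenset
of attribute names) are assumptions of this sketch; absorption
is applied after every multiplication so that the intermediate
formula stays small.

```python
def pos_to_sop(sums):
    """Multiply out a product of sums and apply absorption.

    sums   : iterable of sums, each a set of attribute names
    Result : set of product terms (frozensets); an empty set means the
             formula is zero, i.e. no dependency exists.
    """
    terms = {frozenset()}                      # the empty product, i.e. 1
    for s in sums:
        # Distribute: every existing term times every literal of the sum.
        expanded = {t | {lit} for t in terms for lit in s}
        # Absorption: drop a term that contains all literals of another term.
        terms = {t for t in expanded if not any(u < t for u in expanded)}
    return terms
```

Each remaining term T then gives one dependency T->A for the
chosen attribute A; for instance, the product (A)(A+D)(C)
reduces to the single term AC.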

To clarify the operation of the algorithm, the
relation R(X) in Fig. 3 will be used to generate an
example for each step of the algorithm. This relation
can be found in [21]. Also, some supplementary
examples are given to clarify steps of the algorithm
that are overly simple when applied to this relation.

Fig. 3. Relation R(X).

The first step of the algorithm states that an
attribute to appear on the right side of the
functional dependencies must be selected. So, in the
example relation R(X), the attribute B will be chosen.
The second step of the algorithm generates the set L
of data values associated with this attribute. For
our example, this set is L = {b1, b2, b3}.

Performing the next step of the algorithm on our
relation, the sequence of partitions (P_0*, P_1*) will be
generated. These partitions are sets that contain
other sets, hence

P_0* = {{b1 b2 b3}}

P_1* = {{b1 b2}, {b3}}

is the correct manner of representing these
partitions. This notation contains many braces, and
it needs to be simplified. Henceforth, all partitions
will be denoted by deleting the braces of the blocks
of the partitions, e.g.,

P_0* = {b1 b2 b3}

P_1* = {b1 b2, b3}.

To further clarify this step of the algorithm, let
us examine the flowchart of Fig. 9 and another set of
data values, namely, L1 = {a1, a2, a3, a4, a5, a6, a7, a8,
a9, a10, a11, a12}. The following partitions will be
generated by this step of the algorithm.

P_0* = {a1 a2 a3 a4 a5 a6 a7 a8 a9 a10 a11 a12}

P_1* = {a1 a2 a3 a4 a5 a6, a7 a8 a9 a10 a11 a12}

P_2* = {a1 a2 a3, a4 a5 a6, a7 a8 a9, a10 a11 a12}

P_3* = {a1 a2, a3, a4 a5, a6, a7 a8, a9, a10 a11, a12}

Now, performing the fourth step of the algorithm
on the sequence of partitions generated from the set
L, the partitions

P_1 = P_1* = {b1 b2, b3}

P_2 = {b1 b3, b2}

will be generated. Again, a flowchart and another
example are given to clarify this step of the
algorithm. Examining the flowchart of Fig. 10 and
the sequence of partitions for the set L1, the
sequence

P_1 = P_1*

P_2 = {a1 a2 a3 a7 a8 a9, a4 a5 a6 a10 a11 a12}

P_3 = {a1 a2 a4 a5 a7 a8 a10 a11, a3 a6 a9 a12}

P_4 = {a1 a3 a4 a6 a7 a9 a10 a12, a2 a5 a8 a11}

of partitions is produced.

The fifth step of the algorithm will generate
several copies of the original relation R(X). Using
the partitions P_1 and P_2 of the set L, the copies R_1'
and R_2' of Fig. 4 for relation R(X) of Fig. 3 will be
generated. This figure shows how the two copies R_1'
and R_2' appear before the columns corresponding to
attribute B are deleted.

The first copy R_1' of R(X) is partitioned in the
following manner. Block one of partition P_1 contains
the data values b1 and b2. Therefore, any tuple of
R(X) containing these data values under the column B
will be placed in the sub-relation R_11. Since block
two of P_1 contains only b3, the sub-relation R_12 will
only contain tuples that have the data value b3 under
the column B. Similarly, for the second copy R_2' of
R(X), R_21 will contain tuples that have the data
values b1 and b3 under the attribute B. Likewise, R_22
will only contain tuples that have the value b2 in the
column B.

(b) Copy R_2' of relation R(X).

Fig. 4. Copies of R(X) partitioned according to the P_i's.

After the column of data associated with the
attribute B is deleted, the new relations R_1 and R_2
are as shown in Fig. 5.

(b) Copy R_2 of relation R(X).

Fig. 5. Copies of R(X) with attribute B removed.

The sixth step of the algorithm will be
illustrated by examining the two relations of Fig. 5
and the flowchart of Fig. 12. In the generation of E_1
for relation R_1, the tuples t_1 = <a1, c1, d1> and
t_2 = <a2, c1, d1> are the first two tuples examined. By
comparison of data values, it is found that only
attribute A will be in the first sum S. At this
point, therefore, E_1 consists of only one sum, (A).
After performing the tests in step (v), the tuple t_2
will be changed to t_2 = <a2, c1, d2>, and t_1 will remain
the same. These two tuples are compared, and the new
sum S = A+D is generated. The new sum S is placed in the
formula E_1, and this changes E_1 to E_1 = (A)(A+D). Again
the tests in step (v) are performed, and this time
both t_1 and t_2 will be changed, to t_1 = <a1, c2, d2> and
t_2 = <a2, c1, d1>. After all the tuples of R_1 have been
examined, the formula E_1 is found to be

E_1 = (A)(A+D)(A+C+D)(A+C)(A+D)(A)(A+C)(A+C+D),

which is equivalent to the formula E_1 = A. Performing
the same operations on the relation R_2 will yield the
result

E_2 = (C+D)(C)(C)(C+D)(A+C+D)(A+C)(A+C)(A+C+D),

which is equivalent to E_2 = C. For the final step of the
algorithm, the formula E_B and its derivation are shown
below.

E_B = (E_1)(E_2)

E_B = (A)(C) = AC

Therefore the set DEP contains only the functional
dependency AC->B. By repeated application of this
algorithm for the other attributes, the complete set

{B->A, AC->B, B->C}

of functional dependencies for relation R(X) of Fig. 3
can be produced.

Recalling the key generation algorithm presented
in the preceding chapter, the following results can be
found from the set of functional dependencies. Since
the relation R(X) contains the attributes A, B, C, and
D, the term M will be equal to ABCD, and from the
preceding set of dependencies the equivalent equation
G = 0 can be derived, where G = A'B + AB'C + BC'
(each dependency contributes the product of its left
side with the complement of its right side).
Using the technique of iterated consensus on M and G,
the following results can be obtained.

BCF(G+M) = A'B + AB'C + BC' + BD + ACD

ABCD <= K <= A'B + AB'C + BC' + BD + ACD
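
The iterated-consensus computation above can be checked with a
short Python sketch. Everything named here (term, neg,
consensus, absorb, bcf, keys) is hypothetical and chosen for
illustration; a product term is represented as a frozenset of
literals, with a trailing apostrophe marking a complemented
attribute. It is a plain iterated-consensus loop and makes no
claim to match the routines of the FD-Key program.

```python
def term(text):
    """Parse a product such as "AB'C" into frozenset({"A", "B'", "C"})."""
    lits = []
    for ch in text:
        if ch == "'":
            lits[-1] += "'"            # attach the complement mark to the last letter
        else:
            lits.append(ch)
    return frozenset(lits)

def neg(lit):
    """Complement a literal: A -> A', A' -> A."""
    return lit[:-1] if lit.endswith("'") else lit + "'"

def consensus(t, u):
    """Consensus of two terms, or None unless exactly one variable is opposed."""
    opposed = [l for l in t if neg(l) in u]
    if len(opposed) != 1:
        return None
    l = opposed[0]
    return (t - {l}) | (u - {neg(l)})

def absorb(terms):
    """Remove every term that contains all literals of another term."""
    return {t for t in terms if not any(u < t for u in terms)}

def bcf(terms):
    """Blake canonical form by iterated consensus and absorption."""
    terms = absorb(set(terms))
    while True:
        new = set()
        for t in terms:
            for u in terms:
                c = consensus(t, u)
                # Keep a consensus term only if no existing term absorbs it.
                if c is not None and not any(v <= c for v in terms):
                    new.add(c)
        if not new:
            return terms
        terms = absorb(terms | new)

def keys(bcf_terms):
    """Terms of the BCF that contain only uncomplemented literals."""
    return {t for t in bcf_terms if not any(l.endswith("'") for l in t)}
```

Applied to the four terms A'B, AB'C, BC', and ABCD, the sketch
reproduces the five-term BCF given above, and keys() selects
the terms BD and ACD.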

Therefore  the  set  of  keys  for  the  relation  R(X)  of 
Fig.  3  is  {BD,ACD} ,  and  the  corresponding  set  of 
superkeys  is  {ABD, BCD, ABCD}.  The  results  generated 
above  may  also  be  generated  by  visual  observation  for 
this  relation.  A  much  larger  relation  may  be  very 
hard  to  analyze  visually,  but  the  preceding  algorithm 
will  always  generate  the  desired  results.  The 
implementation  of  this  algorithm  would  be  very  easy  in 
a  language  designed  for  logic  programming  and 
character  string  manipulations.  Fortunately,  the 
programming  language  PROLOG  has  these  capabilities  and 
it  is  very  easy  to  operate  from  a  user's  view.  This 
language  is  presented  in  the  next  chapter  and  the 


CHAPTER  V 


INTRODUCTION  TO  PROLOG 


Basic Structures

In the past few years, several logic programming
languages have been developed. One of the most
powerful of these is PROLOG, a programming language
based on predicate calculus; this language was
developed at the University of Marseille starting
around 1970. A later, interactive version of PROLOG
was implemented on the DECsystem-10 in 1977 [22].
This newer version, containing both an interpreter and
a compiler, allows the user to easily write clear,
readable, and concise programs. The interpreter aids
in the quick development and testing of programs, and
also allows access to compiled programs. The compiler
produces code that executes ten to twenty times faster
than the interpreter, but it is advisable to compile
only well-tested programs. Any compiled program can
easily be provided with an interpretative interface to
the programmer. We present a brief summary in this
chapter of the features of PROLOG; for a more detailed
description, see [20] and [11].

Components of PROLOG

Generally, any object in PROLOG can be called a
term. A term can either be a constant, a variable, or
a compound term. A constant can be any integer from
-131072 to 131071 or an atom. The integers can be
written in any base from two to ten. An atom can be
any sequence of characters, and any possible confusion
with other terms should be eliminated by enclosing the
sequence in quotes. For example, 'Rabbit', rabbit,
[], and * are all atoms.

A variable is distinguished by an initial capital
letter or the leading character "_". Whenever a
variable is only referenced once, it can be denoted by
the single character "_". For example, Rabbit, X,
_32, _result, and _ are all variables.

A compound term is formed with a functor of some
arity greater than zero. The arity of a functor is the
number of terms used as arguments. In the term
member(X,[H|T]), for example, the functor "member" has
an arity of two since X and [H|T] are the two terms
used as arguments. The term [H|T] represents a list,
where H is the first element and T is the tail, i.e., all
remaining elements in the list. An atom may be
considered as a functor of arity zero.

The names and arities of functors are totally
arbitrary. That is, the programmer can introduce as
many different arguments for a desired functor as
needed. PROLOG contains several built-in functors
used to perform basic system operations.

A PROLOG program consists of a set of procedures
which contain clauses. These clauses are made up of
terms, organized into two basic forms. The
propositional logic form of the first type of clause,
called a Horn clause, is of the form

A <= B1 & B2 & B3

where A is called the head of the clause and
B1 & B2 & B3 is called the body of the clause. This
clause is read "A is true if B1 and B2 and B3 are
true." A conditional Horn clause may also have the form

A <= C1 + C2

where this clause is read "A is true if C1 or C2 is
true." The second type of Horn clause, known as a
unit clause, is a true statement such as

A

which is read "A is true."

The PROLOG language requires the head of a clause
to be separated from the body by the symbol ":-", which
represents the word "if" in a logic statement. Also,
any clause must end with a period. For example, the
three preceding clauses translated into PROLOG would
be written as

A:-B1,B2,B3.

A:-C1;C2.

A.

These clauses taken together can be viewed as
procedure A, where B1, B2, B3, C1, and C2 are goals or
other procedures to be called by the PROLOG program.
The goals in the body of a procedure are separated by
the symbols "," or ";", which represent logical
conjunction and disjunction, respectively. These goals
are procedures that impose conditions upon the head of
the clause.

PROLOG also contains provisions for expressing
grammar rules. These rules provide an easy method of
parsing strings into specific components, and using
these components in any manner specified by the
program. The typical grammar rule has the form
LHS-->RHS, and it is read as "a possible form for the
left hand side is the right hand side." Any PROLOG
procedure can be used as a condition on the right side
by simply enclosing the procedure in braces, "{" and "}".
Grammar rules may seem very confusing when first
encountered, but they can be written as ordinary
PROLOG clauses. For example, the grammar rule
p(X)-->q(X) can be translated into the clause
p(X,S1,S):-q(X,S1,S). As an example, the procedure

delim-->"+".
delim-->[].

is a rule to remove the character "+" from a list of
characters. If "+" were not the first character in
the list, the original list would be the result.

PROLOG Semantics

PROLOG  semantics  can  be  presented  in  two  different 
ways.  The  procedural  semantics  describes  the  sequence 
of  states  through  which  the  program  passes  during  an 
execution,  and  the  declarative  semantics  allows  the 
program  to  be  broken  down  into  many  independent 
programs  or  procedures.  These  smaller  procedures  are 
usually  clear  and  easily  executed. 

The declarative semantics makes no reference to
the ordering of clauses or procedures in a goal or
program. This type of semantics is used to
recursively define the conditions necessary for the
head of a clause to be true. That is, the head of a
clause is true if all of the terms in the body of the
clause are also true, and each term is true if it, in
turn, is the head of a clause instance which is true.
Also, a term in the body of a clause may be a compound
term which is the disjunction of two other terms. For
example, the clause

A:-B;C.

is true if the compound term B;C is true, and the
compound term is true if either B or C is true.

The procedural semantics depends upon the ordering
of clauses in a program, and of the goals in a clause,
for crucial program control information. The
execution of the program depends upon this
information, and the reordering of a set of goals or
clauses may completely change the function of a clause
or program. The execution of a goal is performed by
searching for the first clause whose head matches the
goal. This is done in a top-down fashion. That is,
the matching starts at the top of the program and
continues until a match is found. If a match is
found, the goals in the body of the clause are
executed from left to right in the same manner. If no
match is found, the system backtracks to the most
recently matched clause, discards any substitutions
caused by that clause, and continues the search for
another match of the goal that invoked it from that
clause down through the rest of the program.

There is one other type of control information
available in PROLOG, called the cut symbol. This
symbol, "!", is used as a goal in a clause and always
succeeds when it is first called. If PROLOG ever
backtracks to the cut symbol, the goal that caused the
clause containing the cut symbol to be called will
always fail. This symbol allows the programmer to
force a goal to succeed or fail after it has been
partially executed.

Examples of PROLOG Programs

Two  simple  examples  of  PROLOG  programs  will  be 
presented.  The  first  example  will  consist  of  a 
program  to  solve  the  following  logic  problem. 

Bob  likes  logic. 

Mary  likes  logic. 

Bob  likes  anyone  who  likes  logic. 

What  does  Bob  like? 

The PROLOG program will consist of two unit clauses
and one conditional clause involving the predicate
"likes":

likes(bob,logic).
likes(mary,logic).
likes(bob,Y):-likes(Y,logic).

After the program has been interpreted by the
computer, the input query

likes(bob,X).

will yield the results:

X=logic;

X=bob;

X=mary

Here, the symbol ";" is used to request an alternate
answer for the query after one answer has been found.

As another example, consider the problem of
concatenating two lists together to form a third list.
The procedure could be formulated as follows:

concatenate([],L,L).
concatenate([X|L],T,[X|K]):-
    concatenate(L,T,K).

The predicate "concatenate" is defined by the program;
that is, PROLOG does not know what this procedure
means until it receives these statements. However,
the symbols [] and [|] are known to the language. The
first clause states that the empty list concatenated
with a second list is simply the second list. The
second clause states that the list [X|L] concatenated
with the list T is the list [X|K] if the list L
concatenated with the list T is the list K. When the
query

concatenate([a,b,c],[d,e,f],K).

is presented to this program, the variable K will be
returned as the list [a,b,c,d,e,f]. The above
procedures are but two of many possible examples, and
it should be noted that an excellent source [11] of
programming examples exists.

The  next  chapter  contains  a  description  of  the 
operation  of  the  FD-Key  algorithm  presented  in  the 
preceding  chapters.  This  description  contains  the 
syntax  rules  that  must  be  obeyed  for  proper  operation 
of  the  program,  and  some  possible  modifications  that 
the  user  may  wish  to  use. 


CHAPTER  VI 


SYSTEM  REQUIREMENTS  AND  SYNTAX 

The  purpose  of  this  chapter  is  to  present  the 
syntax  rules  and  system  requirements  for  the  correct 
implementation  of  the  FD-KEY  algorithm  as  it  is 
currently  programmed  using  the  language  PROLOG. 

System requirements. For the FD-Key program in
Appendix B to run correctly, the user's computing
system should meet the following requirements. First,
Version 3 of DEC-10 PROLOG or its equivalent must be
used. Otherwise, several clauses in the program will
not function correctly. For example, any clause that
uses the built-in predicate 'read' to input
information from a data file will usually have a test
for the end of file marker, 'end_of_file'. If an
earlier version of PROLOG is used, this marker may be
':-end', and the program will never cease to input
information from the data file. Therefore, a loop
will be created, and the program will either fail or
yield an erroneous result.

The other requirement is that the user's system
must have an adequate amount of memory storage
available. This is because the FD-Key program
generates several files for data input and output
during execution. Also, the program and its compiled
version require several blocks of storage. It should
be noted that the program deletes all of the data
files created during execution. The only exception to
this is the file 'propa', which contains the
functional dependencies of the input relation.

FD-Key algorithm. The FD-Key algorithm consists
of two main routines. The first routine is the
functional dependency algorithm, presented in Chapter
IV. The second routine is the key generation algorithm
presented in Chapter III. The operation of these
algorithms is explained in the following section of
this chapter.

The FD-Key algorithm is designed to perform the
following tasks:

(i)  Input a list of attribute symbols for a
given relation.

(ii)  Output the functional dependencies for
the relation.

(iii)  Output the keys for the relation.

(iv)  Output run-times for various routines in
the algorithm.

Since the language used is PROLOG, all constants
must begin with a lower-case letter. For example, the
list of attribute symbols for relation R(X) of Fig. 3
would be [a,b,c,d]. If the attribute symbols in the
list were capital letters, PROLOG would interpret the
contents of the list as unknown variables. This
interpretation could lead to meaningless answers or to
the failure of the entire program. It should be noted
that the ordering of the attribute symbols is very
important. The symbols must be in the same order as
the columns of the relation. For example, if the list
for R(X) of Fig. 3 were changed to [b,a,d,c], the
algorithm would not generate the correct functional
dependencies, and the generated keys would be invalid
for the relation.

As the FD-Key algorithm is currently programmed,
the relation to be examined must be stored in a
particular form. Each tuple of the relation must be
stored as a list. For example, the tuple

<a1, b1, c1, d1>

would be stored as the list

[a1,b1,c1,d1].

Each list must be followed by a period or the program
will not be able to input the data correctly. The
file 'dbase' contains the lists that correspond to the
tuples of the given relation. To run the main
algorithm, the system must be in PROLOG, and the
compiled version of the FD-Key program must be
restored. For example, consider the output of
Appendix C for the relation R(X) of Fig. 3. The
program is called by the predicate

mainthing([a,b,c,d]).

The argument, [a,b,c,d], of the predicate is
simply the list of attribute symbols for the relation
R(X). When this predicate is executed, the relation
in the file 'dbase' is examined; the functional
dependencies for this relation are stored in the file
'propa', and the keys of the database, along with the
run-times for various routines in the algorithm, are
output to the user.
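
If the relation is held in some other form, the text of the
'dbase' file could be prepared with a small helper such as
the Python sketch below. The helper is not part of the FD-Key
program: its name, to_dbase_text, and the check that every
data value begins with a lower-case letter are assumptions
based only on the storage rules stated above (one list per
tuple, each list followed by a period, constants in lower
case).

```python
def to_dbase_text(relation):
    """Render tuples as PROLOG list terms, one per line, each ending in a period.

    relation : list of tuples of strings, e.g. [("a1", "b1", "c1", "d1")]
    Hypothetical helper; it only formats text and does not write any file.
    """
    lines = []
    for row in relation:
        for value in row:
            # PROLOG reads a capitalised name as a variable, so reject it here.
            if not value[:1].islower():
                raise ValueError(f"not a lower-case PROLOG constant: {value!r}")
        lines.append("[" + ",".join(row) + "].")
    return "\n".join(lines) + "\n"
```

For the tuple <a1, b1, c1, d1> the helper produces the line
[a1,b1,c1,d1]. as required by the storage format.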

If the user wishes to generate the keys for a
relation whose functional dependencies are known, the
set of dependencies must be stored in the file
'propa'. Each dependency must be written as a logical
proposition followed by a period. Thus, the
functional dependency A->B would be stored in the
file 'propa' as

a=>b.

To call the key generation section of the program, the
following predicate is used:

solve_for_keys([a,b,c,d]).

In this example, the argument [a,b,c,d] is the list of
attributes for the relation containing the functional
dependencies found in the file 'propa'.

To make any other predicate of the program
available to the user, a public statement must be used
to declare the predicate, and the new program must be
compiled. For example, to declare the predicate

concatenate(X,Y,Z).

of Chapter V, the statement

:-public concatenate/3.

must be inserted in the program. The format of this
statement is

:-public name/arity.

In this statement, the name of the predicate is
separated from the number of its arguments by the
slash.

The flowcharts of the FD-Key algorithm can be
found in Appendix A. These flowcharts are written at
a level which will enable the user to translate the
algorithm to another language if PROLOG is not
available. The PROLOG program of the FD-Key algorithm
is found in Appendix B. This program contains
numerous comments designed to explain each set of
clauses. Generally, the purpose of the set of clauses
and a brief example are contained in each comment.
Appendix C is made up of some sample runs of the
FD-Key algorithm for various relations. Each run
contains the input commands to the computer, a listing
of the input relation, a listing of the keys for the
relation, and a listing of the functional dependencies
generated by the algorithm.


CHAPTER  VII 


PROPOSALS  FOR  FUTURE  WORK 

There  are  a  number  of  ways  in  which  the  present 
FD-Key  routine  might  be  improved,  i.e.,  made  more 
efficient  or  extended  in  application.  Three  such 
improvements  are  detailed  below. 

Development of an inferential processor. Since the
key generation routines depend upon two separate
calculations of the Blake Canonical Form, a processor
capable of performing this calculation in hardware
would be very advantageous. Fortunately, an
inferential processor has been proposed [8] that can
generate the Blake Canonical Form very quickly. This
device receives a sum-of-terms formula from a host
computer and returns the Blake Canonical Form of that
formula to the host. With this type of device, a
large portion of the key generation program could be
replaced. A revised algorithm to generate the keys is
presented below.

(i)    Input the set S of functional
       dependencies for the relation.

(ii)   Express the set S as a Sum of
       Products formula F.

(iii)  Add the term which corresponds to
       the list of attributes to F.

(iv)   Output F to the hardware processor.

(v)    Input BCF(F) from the processor.

(vi)   The set of keys K corresponds to the
       terms of BCF(F) that contain only
       uncomplemented literals.
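
The six steps above can be tried out in software. The following
Python sketch is not the PROLOG program of this thesis; it replaces
the hardware processor by a plain iterated-consensus calculation of
the Blake Canonical Form. Two assumptions are made that the steps do
not spell out: each dependency X->y is taken to contribute the term
X.y' to F, and the term added in step (iii) is taken to be the
product of all attributes. The function names (consensus, absorb,
bcf, keys), the single-letter attribute names, and the data
representation are illustrative choices only.

```python
from itertools import combinations

def consensus(t1, t2):
    """Return the consensus of two terms, or None if it does not exist.

    A term is a frozenset of (attribute, is_uncomplemented) literals.
    """
    opposed = [(v, p) for (v, p) in t1 if (v, not p) in t2]
    if len(opposed) != 1:          # consensus needs exactly one opposition
        return None
    v, p = opposed[0]
    return (t1 - {(v, p)}) | (t2 - {(v, not p)})

def absorb(terms):
    """Drop every term that properly contains another term."""
    terms = set(terms)
    return {t for t in terms if not any(u < t for u in terms)}

def bcf(terms):
    """Blake Canonical Form (all prime implicants) by iterated consensus."""
    terms = absorb(terms)
    while True:
        new = set()
        for t1, t2 in combinations(terms, 2):
            c = consensus(t1, t2)
            if c is not None and not any(u <= c for u in terms):
                new.add(c)
        if not new:
            return terms
        terms = absorb(terms | new)

def keys(attributes, fds):
    """Steps (i)-(vi); fds is a list of (lhs, rhs) attribute strings."""
    f = set()
    for lhs, rhs in fds:                       # step (ii), assumed X.y'
        for y in rhs:
            if y not in lhs:
                f.add(frozenset({(x, True) for x in lhs} | {(y, False)}))
    f.add(frozenset((a, True) for a in attributes))      # step (iii)
    result = bcf(f)                            # steps (iv)-(v), in software
    found = [t for t in result if all(p for _, p in t)]  # step (vi)
    return sorted("".join(sorted(v for v, _ in t)) for t in found)

print(keys("abcd", [("a", "b"), ("b", "c")]))
```

For the relation with attributes a, b, c, d and the dependencies
a->b and b->c, the sketch reports the single key ad.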

This key generation algorithm clearly reduces the
amount of software used; consequently, both the cost of
processing and the computer time required would be
reduced. With this technique, the algorithm might be
made fast enough for use in real-time data
processing. Another method of speeding up
the key generation routine is presented in the next
section.

Multi-valued dependency generation. As the FD-Key
algorithm is presently formulated, only the functional
dependencies of a relation are manipulated. Including
the information contained in the multi-valued
dependencies of the relation would speed up the
key generation routine. However, two problems remain
to be solved before this improvement can be made.
First, an algorithm to generate the multi-valued
dependencies from the relation would have to be
developed. Second, an algorithm to translate these
dependencies into a sum of products form would have to
be created. Once these problems are overcome, the
new SOP formula could be added to the SOP formula for
the functional dependencies, and the Blake Canonical
Form of the resulting formula might provide
additional information not contained in the original
Blake Canonical Form for the functional dependencies.

Program modifications. Another area of
improvement would be to change the PROLOG
implementation of the FD-Key algorithm. The new
PROLOG program would differ in two ways. First,
the routines that manipulate the Boolean formulas
would be changed. These new routines would work
directly on the sum of products or product of sums
formulas instead of on a list of lists. This
modification would not only speed up the data
manipulations, but would also eliminate the routine
used to parse a formula into a list of lists.
Although this modification would involve major
revisions in the program routines, a large saving of
run-time should be realized.
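
To make the list-of-lists form concrete, the short Python sketch
below converts a sum of products formula to that form and back. It
is an illustration only and does not reproduce the parsing routine
of Appendix B; the operator symbols + (OR), * (AND) and ' (NOT) are
those declared in Appendix B, while the function names parse_sop and
unparse_sop are hypothetical.

```python
def parse_sop(formula):
    """Split a sum-of-products string into a list of lists of literals."""
    products = formula.replace(" ", "").split("+")
    return [product.split("*") for product in products]

def unparse_sop(terms):
    """Rebuild the formula text from its list-of-lists form."""
    return "+".join("*".join(term) for term in terms)

terms = parse_sop("a*b' + b*c'")
print(terms)               # one inner list per product term
print(unparse_sop(terms))  # the formula again, without blanks
```

Each inner list holds the literals of one product term, so routines
that work on the formula text directly would avoid this conversion
step altogether.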

A second way to improve the efficiency of the
program would be to keep as much data as possible in
fast memory, that is, not to store data in files for
later use in the program. With the data kept in fast
memory, the data retrieval time would be very short and
the algorithm would execute more efficiently. This
modification would also require major revisions in
many of the procedures, but the run-times should be
shorter.

Normal form generation. A final area of future
work might be the development of an algorithm to
produce normal forms of a relation. If an algorithm
to generate both functional and multi-valued
dependencies existed, a method to generate the normal
forms of a relation based upon this algorithm could be
produced. The development of this normalization
routine should be very straightforward, since the
normal forms of a relation are generated by examining
the keys and dependencies that are associated with
that relation.


CHAPTER  VIII 


CONCLUSIONS 

The  purpose  of  this  thesis  was  to  present  an 
algorithm  capable  of  generating  the  keys  and 
functional  dependencies  of  a  relational  database.  The 
feasibility  of  this  algorithm  was  demonstrated  by 
implementing  it  with  the  computer  language  PROLOG. 

The  key  generation  algorithm  is  based  on  a 
theorem,  due  to  Sagiv,  concerning  the  equivalence  of 
logical  propositions  and  functional  dependencies. 
This  theorem  allows  the  formidable  problem  of  key 
generation  to  be  solved  by  techniques  of  Boolean 
analysis. 

An  algorithm  based  on  partitioning  was  then 
developed  to  generate  the  functional  dependencies  of  a 
database.  These  two  algorithms  were  combined  to  form 
the  FD-Key  algorithm,  which  was  implemented  using  the 
logic-programming  language  PROLOG. 
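
The idea behind the dependency-generation step can be sketched in a
few lines of Python. This is a simplified stand-in, not the
algorithm of this thesis: it compares every pair of tuples directly
instead of building the two-block partitions and data files used by
the routines of Appendix B. For a chosen right-side attribute, each
pair of tuples that disagree on that attribute contributes one sum
of the other attributes on which the pair also disagrees; the
product of these sums is then multiplied out, with absorption, to
give the minimal left sides. All function names are hypothetical.

```python
from itertools import combinations

def pos_for_attribute(attributes, rows, target):
    """Collect one sum per pair of rows that disagree on `target`."""
    t = attributes.index(target)
    sums = []
    for r1, r2 in combinations(rows, 2):
        if r1[t] != r2[t]:
            sums.append(frozenset(a for i, a in enumerate(attributes)
                                  if i != t and r1[i] != r2[i]))
    return sums

def pos_to_sop(sums):
    """Multiply out a product of sums, absorbing redundant products."""
    products = {frozenset()}
    for s in sums:
        expanded = {p | {a} for p in products for a in s}
        # An empty sum leaves no products: no dependency exists.
        products = {p for p in expanded if not any(q < p for q in expanded)}
    return products

def dependencies(attributes, rows):
    """Return sorted (lhs, rhs) pairs with minimal left sides."""
    found = []
    for target in attributes:
        for lhs in pos_to_sop(pos_for_attribute(attributes, rows, target)):
            found.append(("".join(sorted(lhs)), target))
    return sorted(found)

rows = [(1, 1, 1), (1, 2, 1), (2, 1, 2), (2, 2, 2)]
print(dependencies(["a", "b", "c"], rows))
```

In this small relation the columns a and c determine each other,
while nothing determines b, because two tuples differ in b alone.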

This implementation involved many different uses
of propositional logic, Boolean analysis, and the
Blake canonical form for the actual generation of
functional dependencies and keys for a relational
database. The execution of the FD-Key algorithm
imposes a distinct set of computer system
requirements. These requirements were presented and
some actual executions of the FD-Key algorithm were
given as examples.

The  FD-Key  algorithm  provides  a  convenient  method 
for  generating  the  keys  and  functional  dependencies  of 
a  database.  This  algorithm  has  produced  a  solution  to 
a  difficult  and  complex  problem  of  relational 
databases,  namely  the  identification  of  the  keys 
necessary  to  access  the  information  stored  in  the 
database,  by  using  the  techniques  of  propositional 
logic  and  Boolean  analysis. 


APPENDIX  A 


FLOWCHARTS FOR THE FD-KEY ALGORITHM

Fig. 7.  Functional dependency routine (mainthing).

Fig. 8.  Routine to generate list of unique
         data items (getdatalist).

Fig. 9.  Routine to generate the list
         of P^* partitions (listofparts).

Fig. 10. Routine to generate the list of
         two-block partitions (genblocks1).

Fig. 11. Routine to create n copies of the
         original relation R (formpartitions).

Fig. 12. Routine to test for functional dependencies and
         generate the POS formula EJ (formfunction).

Fig. 13. Routine to generate a sum of products
         formula from a product of sums formula
         (convertpos).

Fig. 14. Routine to generate the functional
         dependencies from a list of lists
         (formdeps).

Fig. 15. Key generation routine (solve_for_keys).

Fig. 16. Routine to convert implications to
         sum of products formula (doit1).

Fig. 17. Routine to parse a sum of products formula
         into a list of lists form (parseit).

Fig. 18. BCF routine for a sum of products
         formula F (bcfs).

Fig. 19. Routine to generate a list of
         key lists (find_the_keys).


APPENDIX  B 


PROLOG  PROGRAM  FOR  THE  FD-KEY  ALGORITHM 


:-public find_the_keys/3,keys/1.
:-public solve_for_keys/1,mainthing/1.


/*************************************************/
/*                                               */
/* These are the operators: '+' is logic OR, '*' */
/* is logic AND, ''' is logic NOT, and '=>' is   */
/* logic implication.                            */
/*                                               */
/*************************************************/

:-op(900,xfx,=>).
:-op(690,yfx,+).
:-op(600,yfx,*).
:-op(500,xf,').


/*************************************************/
/*                                               */
/* mainthing(A) is a routine to time the         */
/* functional dependency generation algorithm and*/
/* the key generation algorithm. mainthing also  */
/* calls the routines to generate the functional */
/* dependencies and the keys for a relation.     */
/* A is the list of attribute symbols.           */
/*                                               */
/*************************************************/

mainthing([X|L]):-time0(T),
    mainthing1([X|L],[X|L]),time0(T1),
    close(propa),Time is T1-T,
    write('Time for functional dependency '),
    write('generation is '),
    nl,write(Time),write('ms'),nl,
    solve_for_keys([X|L]),clearfiles1.

/*************************************************/
/*                                               */
/* clearfiles1 is a routine to delete the data   */
/* files dat, list, and blake.                   */
/*                                               */
/*************************************************/

clearfiles1:-see(dat),rename(dat,[]),see(list),
    rename(list,[]),
    see(blake),rename(blake,[]).

/*************************************************/
/*                                               */
/* mainthing1(A,A) is a routine to generate all  */
/* of the functional dependencies of the relation*/
/* stored in the file dbase. A is the list of    */
/* attribute symbols.                            */
/*                                               */
/*************************************************/

mainthing1([],[P|Q]):-!.
mainthing1([X|L],[P|Q]):-getdatalist(X,[P|Q],[Z|K]),
    listofparts([Z|K],0,N),
    genblocks1(N,1,1,[H|T]),numattr(X,[P|Q],P1),
    formpartition([H|T],P1),removec(X,[P|Q],[H1|T1]),
    length([H|T],M),(formfunction([H1|T1],M,1,X),
    convertpos(X),formdeps(X);true),clearfiles(M,X),
    mainthing1(L,[P|Q]).


/*************************************************/
/* getdatalist(X,L,L1,M) is a routine that       */
/* returns the list L1 of unique data values     */
/* found in the column X of the database         */
/* stored in the file dbase. The list L of       */
/* attributes for the database must be given     */
/* along with the length M of this list. L       */
/* includes the attribute X. This list must be   */
/* in the same order as the columns of the dbase.*/
/*************************************************/

getdatalist(X,[Y|L],[Z|K]):-numattr(X,[Y|L],N),
    see(dbase),
    getdata1(N,[H|T]),nodups([H|T],[Z|K]),seen.

/*************************************************/
/* numattr(X,L,N) is a routine that returns N the*/
/* numerical position of the attribute X in the  */
/* list of attributes L.                         */
/*************************************************/

numattr(X,[Y|L],N):-X=Y,N=1.
numattr(X,[Y|L],N):-numattr(X,L,N1),N is N1+1.

/*************************************************/
/* getdata1(N,H) reads a tuple (row) from dbase  */
/* and returns H all the data values found in    */
/* column N of the database.                     */
/*                                               */
/*************************************************/

getdata1(N,P):-read(X),getdata2(N,X,P).
/*************************************************/
/* getdata2(N,B,C) determines if there are no    */
/* more rows in the database and returns C, an   */
/* empty list if this is true. Otherwise, the    */
/* nth data value of the tuple B is placed into  */
/* the list of values C and another row is       */
/* obtained by getdata1.                         */
/*************************************************/

getdata2(N,end_of_file,[]):-!.
getdata2(N,X,[Y|T]):-nthinlist(N,X,Y,1),
    getdata1(N,T).

/*************************************************/
/* nodups(L,L1) removes any duplicate data       */
/* values from the list L and returns this       */
/* revised list L1.                              */
/*                                               */
/*************************************************/

nodups([],[]).
nodups([H|T],[Z|K]):-member(H,T),nodups(T,[Z|K]).
nodups([H|T],[H|K]):-nodups(T,K).

/*************************************************/
/* nthinlist(N,L,Y,M) returns the nth data value */
/* Y in the tuple L. M is a counter.             */
/*************************************************/

nthinlist(N,[X|L],X,M):-M=N.
nthinlist(N,[X|L],Y,M):-M1 is M+1,
    nthinlist(N,L,Y,M1).

/*************************************************/
/*                                               */
/* listofparts(J,N) inputs a list of unique data */
/* values, and outputs N the number of partitions*/
/* of this data list stored in the files p1....pN*/
/* where N is a number to be determined.         */
/*                                               */
/*************************************************/

listofparts(J,N,M):-length(J,Y),Y1 is Y+1,
    N1 is Y1/2,
    formblock1(0,N1,J,X,K1),testn(N,[X,K1],M).

/*************************************************/
/*                                               */
/* listofparts1(N,[X|L],C) inputs a number N, a  */
/* partition [X|L] in a list of lists form, and  */
/* outputs a list of lists C which is a new      */
/* partition of [X|L]. Every list X in [X|L]     */
/* will have been separated into two lists and   */
/* inserted as two lists in the list of lists C. */
/*                                               */
/*************************************************/

listofparts1(N,[],[]).
listofparts1(N,[H|T],[X1,K2|U]):-length(H,Y),
    Y1 is Y+1,
    N1 is Y1/2,formblock1(0,N1,H,X1,K2),N2 is N+1,
    listofparts1(N2,T,U).

/*************************************************/
/*                                               */
/* formblock1(Z,N,L,L1,K) inputs the value of a  */
/* counter Z, a list of data values L, and       */
/* outputs two new lists L1 and K which consist  */
/* of a partition of the list L. The number of   */
/* elements required to be in the list L1 is     */
/* input as the number N.                        */
/*                                               */
/*************************************************/

formblock1(Z,Z,K1,[],K1):-!.
formblock1(Z,N1,[J|K],[J|L],K1):-Z1 is Z+1,
    formblock1(Z1,N1,K,L,K1).

/*************************************************/
/*                                               */
/* testn is a routine to call writepart.         */
/*                                               */
/*************************************************/

testn(0,H,M):-writepart(0,H,M).
testn(N,[H|T],M):-writepart(N,[H|T],M).

/*************************************************/
/*                                               */
/* writepart(N,L) is a routine that writes the   */
/* partition L in the file p(N+1). N is an       */
/* input number and L is a list of lists. This   */
/* routine also calls testh(A,B).                */
/*                                               */
/*************************************************/

writepart(N,H,Z):-N1 is N+1,name(N1,M),
    concat([112],M,P),
    name(P1,P),tell(P1),write(H),write('.'),
    told,testh(H,N1,Z).

/*************************************************/
/*                                               */
/* testh([H|T],N) inputs a number N and a        */
/* partition [H|T], which is in a list of lists  */
/* form. N is the number associated with the file*/
/* pN where [H|T] is stored. The length of the   */
/* list H is tested. If the length is less than  */
/* or equal to two, then the routine testt(T,N)  */
/* is called. Otherwise, the N+1th partition is  */
/* formed by calling listofparts1.               */
/*                                               */
/*************************************************/

testh([H|T],N,M):-length(H,Y),
    (Y=<2,testt(T,N,M);
    listofparts1(N,[H|T],G),testn(N,G,M)).

/*************************************************/
/*                                               */
/* testt(T,N) inputs a list of lists T and a     */
/* number N. If T is an empty list the routine   */
/* succeeds, otherwise testh(T,N) is called.     */
/*                                               */
/*************************************************/

testt([],N,N):-!.
testt(T,N,M):-testh(T,N,M).

/*************************************************/
/*                                               */
/* genblocks(A,B,C,D) and genblocks1(A,B,C,D)    */
/* are routines to form a list D of two-block    */
/* partitions that are maximally skewed. A is    */
/* the number of partitions and both B and C are */
/* counters.                                     */
/*                                               */
/*************************************************/

genblocks1(Z,Z,M,[X]):-getname(Z,B),see(B),
    read(X),seen,test2(X).
genblocks1(Z,N,M,[X|L]):-getname(N,B),see(B),
    read(X),seen,genblocks(Z,N,M,L).
genblocks(Z,N,M,[X|L]):-getname(N,B),see(B),
    read(Y),seen,
    getapart(Z,N,1,T,Y),T=[X],(Z=N,L=[];N1 is N+1,
    genblocks(Z,N1,1,L)).
genblocks(Z,N,M,L):-getname(N,B),see(B),
    read(X),seen,getapart(Z,N,M,L,X).

/*************************************************/
/*                                               */
/* test2([X|L]) is a routine to test if the      */
/* length of each list X in [X|L] is less than 2 */
/*                                               */
/*************************************************/

test2([]):-!.
test2([H|T]):-length(H,Y),!,Y<2,test2(T).

/*************************************************/
/*                                               */
/* getapart(A,B,C,D,E) returns E a maximally     */
/* skewed two-block partition.                   */
/*                                               */
/*************************************************/

getapart(Z,N,2,[],X):-!.
getapart(Z,N,M,[X1|L1],[X|L]):-M1 is M+1,
    getblock(Z,N,M1,[X1],X,L),
    N1 is N+1,getapart(Z,N1,M1,L1,L).

/*************************************************/
/*                                               */
/* getblock(A,B,C,D,E,F) and getblock1(A,B,C,D,E)*/
/* generate a block of the partition.            */
/*                                               */
/*************************************************/

getblock(Z,N,M,[X],[H|T],[]):-length([H|T],Y),
    Y1 is Y+1,N2 is Y1/2,
    getblock1([H|T],0,N2,W,V),concat([W],[V],X).
getblock(Z,N,M,[X],[H|T],[H1|T1]):-
    length([H|T],Y),Y1 is Y+1,N2 is Y1/2,
    getblock1([H|T],0,N2,W,V),
    getblock(Z,N,M,[X1|L1],H1,T1),
    formsome([W,V],[X1|L1],X).
getblock1(H,N2,N2,[],H):-!.
getblock1(H,N,N,X,H):-!.
getblock1([H|T],C,N2,[H|T1],V):-C1 is C+1,
    getblock1(T,C1,N2,T1,V).

/*************************************************/
/*                                               */
/* getname(N,B) is a routine to form the file    */
/* name B=pN, where N is a number.               */
/*                                               */
/*************************************************/

getname(N,B):-name(N,M),concat([112],M,V),
    name(B,V),!.

/*************************************************/
/*                                               */
/* formsome(A,B,C) forms a partition C from two  */
/* blocks A and B.                               */
/*                                               */
/*************************************************/

formsome([W,V],[X],Z):-formsome([W,V],X,Z).
formsome([W,V],[A,B],[X2,L2]):-concat(W,A,X2),
    concat(V,B,L2).

/*************************************************/
/*                                               */
/* formpartition(X,Y) inputs a list X of Y two-  */
/* block partitions. Each partition is a list    */
/* containing two lists or blocks. This routine  */
/* will form 2Y files such that each file        */
/* corresponds to a block in a partition. Hence, */
/* Y copies of the file dbase will be created.   */
/* These new relations will only contain data    */
/* values not found in the partitions.           */
/*                                               */
/*************************************************/

formpartition([X|L],P):-see(dbase),
    formparts1([X|L],1,P),
    seen,told.

/*************************************************/
/*                                               */
/* formparts1([X|L],R,P) performs two different  */
/* tests on a row from the relation. Test 1      */
/* determines if the last row in the dbase has   */
/* been processed. A true response will initiate */
/* test 2. A false response will call the routine*/
/* part1. Test 2 determines if the last partition*/
/* has been processed. A true response will      */
/* terminate the routine. A false response will  */
/* increment the counter R and call formparts1.  */
/*                                               */
/*************************************************/

formparts1([X|L],R,P):-read(G),
    (G=end_of_file,(L=[];R1 is R+1,
    close(dbase),see(dbase),formparts1(L,R1,P));
    part1(X,[X|L],G,R,1,P)).
part1([Y|N],[X|L],[A|B],R,Q,P):-[Z|K]=Y,
    compare_the_value(Z,[A|B],P,1),
    name(R,F),name(Q,E),
    concat(F,E,W),concat([112],W,C),name(I,C),tell(I),
    writetuple(Z,[A|B],[H|T]),write([H|T]),write('.'),
    nl,formparts1([X|L],R,P).

/*************************************************/
/*                                               */
/* part1(L,L1,L2,R,Q,N) places the tuple L2 from */
/* the file dbase into the file pRQ, which is the*/
/* Qth block of partition R. This is accomplished*/
/* by testing for membership of a data value in  */
/* block Q of partition R, in the tuple L2 and   */
/* writing this into the correct file.           */
/*                                               */
/*************************************************/

part1([Y|N],[X|L],[A|B],R,Q,P):-length(Y,N1),N1=1,
    part1(N,[X|L],[A|B],R,2,P).
part1([Y|N],[X|L],[A|B],R,Q,P):-[Z|K]=Y,
    part1([K|N],[X|L],[A|B],R,Q,P).


/*************************************************/
/*                                               */
/* writetuple(Z,L,L1) returns a list L1 which is */
/* a subtuple of the tuple L with the data value */
/* Z removed.                                    */
/*                                               */
/*************************************************/

writetuple(Z,[Z|B],B):-!.
writetuple(Z,[A|B],[A|K]):-writetuple(Z,B,K).
/*************************************************/
/*                                               */
/* removec(X,L,L1) is a routine to remove the    */
/* attribute symbol X from the list of attribute */
/* symbols L, and return this new list L1.       */
/*                                               */
/*************************************************/

removec(X,[],[]):-!.
removec(X,[X|L],L):-!.
removec(X,[H|T],[H|L1]):-removec(X,T,L1).

/*************************************************/
/*                                               */
/* compare_the_value(Z,L,N,C) determines if Z is */
/* the Nth data value in the tuple L, using the  */
/* counter C.                                    */
/*                                               */
/*************************************************/

compare_the_value(Z,[],N,N):-fail.
compare_the_value(Z,[Z|B],N,N):-!.
compare_the_value(Z,[A|B],N,Q):-
    N>Q,Q1 is Q+1,!,
    compare_the_value(Z,B,N,Q1).

/*************************************************/
/*                                               */
/* formfunction(A,B,C,D) is a routine to form a  */
/* product of sums formula which contains all    */
/* information necessary to generate the         */
/* functional dependencies with attribute D on   */
/* the right side. This product of sums formula  */
/* is stored in the file eD where D is the       */
/* attribute symbol. A is the list of ordered    */
/* attribute symbols with D removed. B is the    */
/* number of copies of the relation. C is a      */
/* counter used to access the correct copy of the*/
/* relation.                                     */
/*                                               */
/*************************************************/

formfunction(Q,M,N,C):-tell(temp),
    gettuple1(N,X),
    gettuple2(N,Z),
    formeqn(X,X,Z,Q,Q,M,N,C),!.

/*************************************************/
/*                                               */
/* gettuple1(N,X) returns X a tuple of file pN1. */
/* If all the tuples have been read X = [].      */
/*                                               */
/*************************************************/

gettuple1(N,X):-name(N,Z),name(1,Y),
    concat(Z,Y,W),concat([112],W,M),
    name(P,M),see(P),read(X1),
    (X1=end_of_file,X=[],seen;X=X1),!.

/*************************************************/
/*                                               */
/* gettuple2(N,X) is identical to gettuple1      */
/* except file pN2 is accessed.                  */
/*                                               */
/*************************************************/

gettuple2(N,Z):-name(N,X),name(2,Y),
    concat(X,Y,W),concat([112],W,M),
    name(P,M),see(P),read(Z1),
    (Z1=end_of_file,Z=[],seen;Z=Z1),!.

/*************************************************/
/*                                               */
/* formeqn(A,B,C,D,E,F,G,H) is a routine to test */
/* for non-identical data values in the tuples   */
/* from pG1 and pG2. Originally A=B from pG1, C  */
/* is a tuple from pG2, D and E are lists of     */
/* attribute symbols with the symbol H removed,  */
/* and F is the number of data files. If B=C, no */
/* functional dependencies exist and formeqn will*/
/* fail. If the tuples are different, the first  */
/* data values of each tuple are tested for      */
/* equality. If they are equal, the rest of      */
/* tuple B will replace B and the rest of the    */
/* tuple C will replace C in the next call of    */
/* formeqn. If the data values are different,    */
/* the attribute symbol associated with these    */
/* values is stored as a literal in a sum in the */
/* file temp. If there are no other data values  */
/* in B and C, testtuple is called. Otherwise    */
/* formeqn2 is called.                           */
/*                                               */
/*************************************************/

formeqn(P,[X|L],[Z|K],Q,[A|B],M,N,C):-
    [X|L]=[Z|K],!,fail.
formeqn(P,[X|L],[Z|K],Q,[A|B],M,N,C):-X=Z,!,
    formeqn(P,L,K,Q,B,M,N,C),!.
formeqn(P,[X|L],[Z|K],Q,[A|B],M,N,C):-
    write(A),(L=[],
    gettuple2(N,Z1),testtuple(P,X,Z1,Q,[A|B],M,N,C);
    formeqn2(P,L,K,Q,B,M,N,C)),!.

/*************************************************/
/*                                               */
/* formeqn2(A,B,C,D,E,F,G,H) is a routine similar*/
/* to formeqn. The only differences are that A   */
/* is the complete tuple of pG1, B and C are parts*/
/* of the tuples from pG1 and pG2, and E is the  */
/* associated part of the attribute list. Also,  */
/* if the subtuples B and C are equal, there will*/
/* be no more literals placed in the sum stored  */
/* in the file temp.                             */
/*                                               */
/*************************************************/

formeqn2(P,[X|L],[Z|K],Q,[A|B],M,N,C):-
    [X|L]=[Z|K],write('.'),nl,
    gettuple2(N,Z1),testtuple(P,P,Z1,Q,Q,M,N,C),!.
formeqn2(P,[X|L],[Z|K],Q,[A|B],M,N,C):-X=Z,!,
    formeqn2(P,L,K,Q,B,M,N,C),!.
formeqn2(P,[X|L],[Z|K],Q,[A|B],M,N,C):-write('+'),
    write(A),(L=[],write('.'),nl,
    gettuple2(N,Z1),testtuple(P,P,Z1,Q,[A|B],M,N,C);
    formeqn2(P,L,K,Q,B,M,N,C)),!.

/*************************************************/
/*                                               */
/* testtuple(A,B,C,D,E,F,G,H) is a routine to    */
/* test if there are no more tuples in the file  */
/* pG2. If this is true, testtuple1 is called,   */
/* otherwise formeqn is called with the new tuple*/
/* C from pG2.                                   */
/*                                               */
/*************************************************/

testtuple(P,X,Z,Q,A,M,N,C):-Z=[],gettuple1(N,X1),
    testtuple1(X1,X1,Z,Q,Q,M,N,C),!.
testtuple(P,X,Z,Q,A,M,N,C):-
    formeqn(P,P,Z,Q,Q,M,N,C),!.

/*************************************************/
/*                                               */
/* testtuple1(A,B,C,D,E,F,G,H) is a routine to   */
/* test if there are no more tuples in pG1. If   */
/* this is true, the file temp is closed and the */
/* routine andtemp is called to generate the POS */
/* formula. If this was the last data file, the  */
/* routine changeit is called to delete any      */
/* extraneous sums in the POS formula. Then      */
/* formalph is called to form the logical AND of */
/* each formula generated by each data file. If  */
/* this was not the last data file, the routine  */
/* changeit is called to absorb any extraneous   */
/* sums in the POS formula and the next data file*/
/* is examined.                                  */
/*                                               */
/*************************************************/

testtuple1(P,X,Z,Q,A,M,N,C):-X=[],told,
    see(temp),read(Y),
    andtemp(N,Y),(N=M,changeit(N),
    formalph(M,C,e,1);N1 is N+1,changeit(N),
    formfunction(Q,M,N1,C)),!.
testtuple1(P,X,Z,Q,A,M,N,C):-
    gettuple2(N,Z1),formeqn(P,P,Z1,Q,A,M,N,C),!.

/*************************************************/
/*                                               */
/* andtemp(N,X) inputs a sum of attribute symbols*/
/* Y from the file temp, and calls andtemp2.     */
/*                                               */
/*************************************************/

andtemp(N,end_of_file):-!.
andtemp(N,X):-getname(N,B),tell(B),write('('),
    write(X),write(')'),read(Y),andtemp2(N,Y),!.

/*************************************************/
/*                                               */
/* andtemp2(N,X) writes X as a product of the    */
/* POS formula stored in the file pN.            */
/*                                               */
/*************************************************/

andtemp2(N,end_of_file):-
    nl,seen,told,!.
andtemp2(N,X):-write('*'),write('('),
    write(X),write(')'),read(Y),andtemp2(N,Y),!.

/*************************************************/
/*                                               */
/*  formalph(M,C,e,N) inputs the number of data  */
/*  files M, the constant e, the attribute symbol*/
/*  C, and tests if the pN contains the last POS */
/*  formula. If this is true, the formula is     */
/*  stored in the file eC. Otherwise X, the POS  */
/*  formula, is read and formalph1(X,M,C,e,N)    */
/*  is called.                                   */
/*                                               */
/*************************************************/

formalph(M,C,e,N) :- M=N, getname(N,V), see(V),
    read(X), seen, name(C,Z), name(e,K),
    concat(K,Z,B), name(L,B), tell(L), write(X),
    write('.'), told, !.
formalph(M,C,e,N) :- getname(N,V), see(V), read(X), seen,
    N1 is N+1, formalph1(X,M,C,e,N1), !.

/*************************************************/
/*                                               */
/*  formalph1(X,M,C,e,N) is a routine similar to */
/*  formalph except that X is the list containing*/
/*  all of the POS formulas for the pN data      */
/*  files. This final list is stored in eC if the*/
/*  last formula is in X; otherwise formalph1 is */
/*  called.                                      */
/*                                               */
/*************************************************/

formalph1(Q,M,C,e,N) :- M=N, getname(N,W), see(W),
    read(X), seen, concat(Q,X,Z),
    name(C,L), name(e,K), concat(K,L,B),
    name(D,B), tell(D), abspr(Z,[],F),
    write(F), write('.'), told, !.
formalph1(Q,M,C,e,N) :- getname(N,W), see(W),
    read(X), seen, concat(Q,X,Z),
    N1 is N+1, formalph1(Z,M,C,e,N1), !.

/*************************************************/
/*                                               */
/*  changeit(N) converts the POS formula in the  */
/*  file pN to a list of list form by calling    */
/*  nparse. Also, any extraneous lists are       */
/*  deleted by the routine abspr. This new list  */
/*  is stored in the file pN again.              */
/*                                               */
/*************************************************/

changeit(N) :- name(N,Z), concat([112],Z,U), name(P,U),
    see(P), get(C), getstr(C,G), seen, nparse(F,G,[]),
    abspr(F,[],W),
    tell(P), write(W), write('.'), told, !.
/*************************************************/
/*                                               */
/*  convertpos(C) inputs the list X of lists form*/
/*  of all the POS formulas stored in the file   */
/*  eC. The list X has all extraneous lists      */
/*  absorbed by the routine abspr. This new list */
/*  Z is stored as a POS formula Q in the file   */
/*  eC. This formula Q is translated into a SOP  */
/*  formula by simp(Q,K). Then K is stored in the*/
/*  file eC. The ascii code for K is input and   */
/*  converted to a list W of lists form. Then the*/
/*  extra lists are absorbed by abspr(W,[],V).   */
/*  Finally, the list V is stored in the file eC.*/
/*                                               */
/*************************************************/

convertpos(C) :- name(C,M), name(e,N), concat(N,M,P),
    name(L,P), see(L), read(X),
    seen, abspr(X,[],Z), tell(L), writeeq(Z),
    told, see(L), read(Q), seen,
    simp(Q,K), tell(L), write(K), told,
    see(L), get(S), getstr(S,N1), seen,
    parse(W,N1,[]), abspr(W,[],V), tell(L), write(V),
    write('.'), told, !.

/*************************************************/
/*                                               */
/*  writeeq(A), writeeq1(A), writeeq2(A) inputs a*/
/*  list A of lists and stores a POS formula in  */
/*  the file eC that corresponds to this list.   */
/*                                               */
/*************************************************/

writeeq([X|L]) :- write('('), [Y|K]=X, write(Y),
    writeeq1(K), writeeq2(L), !.
writeeq1([]) :- write(')'), !.
writeeq1([Y|K]) :- write('+'), write(Y), writeeq1(K), !.
writeeq2([]) :- write('.'), !.
writeeq2([X|L]) :- write('&'), write('('), [Y|K]=X,
    write(Y), writeeq1(K),
    writeeq2(L), !.

/*************************************************/
/*                                               */
/*  nparse(Z) returns Z a list of lists for a POS*/
/*  formula.                                     */
/*                                               */
/*************************************************/

nparse(Z) --> nterm(X), "&", nrestparse(X,Z).
nparse([Z]) --> nterm(Z).

/*************************************************/
/*                                               */
/*  nterm(Z) converts a term in the POS formula  */
/*  into a list.                                 */
/*                                               */
/*************************************************/

nterm(Z) --> ndelim, ntoken(X), ndelim, nresterm(X,Z), !.


/*************************************************/
/*                                               */
/*  ndelim removes the following symbols from the*/
/*  list created by nterm: &, +, ', (, ).        */
/*                                               */
/*************************************************/


ndelim 

ndelim 

ndelim 

ndelim 

ndelim 


!  . 


— >  *>■»!♦ 
— >  C3f  !  ♦ 


/*************************************************/
/*                                               */
/*  nresterm(X,Y) sets the first element of list */
/*  Y to the element X, and calls nterm to find  */
/*  the rest of the list Y.                      */
/*                                               */
/*************************************************/

nresterm(X,[X|R]) --> nterm(R).
nresterm(X,[X]) --> [].

/*************************************************/
/*                                               */
/*  ntoken(Y) returns an atom Y for a member of  */
/*  the ascii list.                              */
/*                                               */
/*************************************************/

ntoken(Y) --> [X], [96], {name(Y,[X,96])}, !.
ntoken(Y) --> [X], {name(Y,[X]), Y\=='&'}, !.




/*************************************************/
/*                                               */
/*  nrestparse(X,Y) sets the first element of the*/
/*  list of lists Y to be the list X, which      */
/*  corresponds to the first term of the POS     */
/*  formula. nrestparse then calls nparse.       */
/*                                               */
/*************************************************/

nrestparse(X,[X|R]) --> nparse(R).
nrestparse(X,[X]) --> [].


/*************************************************/
/*                                               */
/*  formdeps(C) writes all the functional        */
/*  dependencies with attribute C on the right   */
/*  side that exist in the relation into the file*/
/*  propa.                                       */
/*                                               */
/*************************************************/

formdeps(C) :- getlist(C,[X|L]),
    tell(propa), formdeps1(C,[X|L]).

/*************************************************/
/*                                               */
/*  formdeps1(C,L) inputs an attribute C and a   */
/*  list of lists L. L is a list of all the left */
/*  sides of the functional dependencies in the  */
/*  relation. This calls getterm and itself      */
/*  until L is an empty list.                    */
/*                                               */
/*************************************************/

formdeps1(C,[]) :- !.
formdeps1(C,[X|L]) :- getterm(C,X), formdeps1(C,L).

/*************************************************/
/*                                               */
/*  getterm(C,L) inputs C, the right side        */
/*  attribute symbol, and a list L of attributes */
/*  for the left side of a functional dependency.*/
/*                                               */
/*************************************************/

getterm(C,[Z|K]) :- K=[], write(Z),
    write('=>'), write(C), write('.'), nl.
getterm(C,[Z|K]) :- write(Z), write('&'), getterm(C,K).
/*************************************************/
/*                                               */
/*  getlist(C,L) inputs an attribute C and       */
/*  returns a list of left sides for functional  */
/*  dependencies that was stored in the file eC. */
/*                                               */
/*************************************************/

getlist(C,X) :- name(C,G), concat([101],G,V),
    name(M,V), see(M), read(X), seen.

/*************************************************/
/*                                               */
/*  clearfiles and nosubscripts are routines to  */
/*  delete the data files used in the functional */
/*  dependency generation routine.               */
/*                                               */
/*************************************************/

clearfiles(N,X) :- name(X,Z), concat([101],Z,U),
    name(U1,U), tell(U1), rename(U1,[]),
    tell(temp), rename(temp,[]), nosubscripts(N,1).
nosubscripts(N,M) :- name(M,M1),
    concat([112],M1,M2), concat(M2,[49],M3),
    name(M4,M3), tell(M4), rename(M4,[]),
    concat(M2,[50],M5), name(M6,M5),
    tell(M6), rename(M6,[]), name(Q,M2),
    tell(Q), rename(Q,[]), (N=M; Z is M+1,
    nosubscripts(N,Z)).


/*************************************************/
/*                                               */
/*  The SOP formula to be put in Blake Canonical */
/*  Form is represented as a list F of lists and */
/*  each of the lists in F corresponds to a term */
/*  in the SOP formula. The method of iterated   */
/*  consensus is used. doitall is a procedure to */
/*  time the routine for propositional logic to  */
/*  BCF translation. doit is a routine to read   */
/*  in the propositional logic statements and    */
/*  return the BCF of them. doit1 converts the   */
/*  propositions into a SOP form. bcfs inputs a  */
/*  list of lists and outputs the BCF of this    */
/*  list as a list of lists. parseit parses the  */
/*  SOP formula into a list of lists.            */




/*                                               */
/*  keys is a routine to input the information   */
/*  needed to locate the keys of a relation. The */
/*  BCF form of the functional dependencies must */
/*  be stored in the file BLAKE. This routine    */
/*  also prints out the time for the execution   */
/*  and the keys. L is the attribute symbols.    */
/*                                               */
/*************************************************/

solve_for_keys(L) :- doitall, keys(L).

doitall :- timeO(T), doit, timeO(T1), Time is T1-T,
    write('Time for bcf is '), nl, write(Time),
    write('ms.'), nl.

doit :- see(propa), read(X), (compare(=,X,end_of_file);
    trans(X,Z), tell(dat),
    write(Z), doit1, nl, seen, told, parseit, bcfs).
doit1 :- read(X), (compare(=,X,end_of_file);
    write(' + '),
    nl, trans(X,Z), write(Z), doit1).
parseit :- see(dat), get(C), getstr(C,X),
    parse(P,X,[]),
    tell(list), write(P), write('.'), seen, told.
bcfs :- see(list), read(X), seen, bcf(X,Z),
    tell(blake), write(Z), write('.'), told.
keys(L) :- timeO(T), see(blake), read(Y), seen,
    find_the_keys(L,Y,N), timeO(T1),
    Time is T1-T, write('Time for key search is '),
    write(Time), write(' ms.'), nl, write_keys1(N).
bcfit([X|L],[Z|K]) :- timeO(T),
    bcf([X|L],[Z|K]), timeO(T1),
    Time is T1-T, write('TIME IS '), nl,
    write(Time), write('ms').

/*************************************************/
/*                                               */
/*  bcf(A,B) returns B = BCF(A).                 */
/*                                               */
/*************************************************/

bcf([X|L],[Z|K]) :- testit([],[X|L],M,N),
    bcf1([X|L],[M],N,[Z|K]).


/*************************************************/
/*                                               */
/*  bcf1(A,B,C,D) generates a list of consensus  */
/*  lists between the list C and the list of     */
/*  lists B. The list of consensus lists and the */
/*  list B of lists are tested for absorptions.  */
/*  The new left part of the list A along with   */
/*  the next list of B is determined and bcf1 is */
/*  called again. The BCF of the list A of lists */
/*  is returned.                                 */
/*                                               */
/*************************************************/

bcf1([X|L],[M|R],[],[X|L]).
bcf1([X|L],[M|R],N,[Z|K]) :- consc([M|R],N,C),
    abspr(C,[X|L],[Y|S]),
    (member(N,[Y|S]), start1(N,[Y|S],M1,N1),
    bcf1([Y|S],M1,N1,[Z|K]);
    reverse([M|R],[X2|L2]),
    testit([X2|L2],[Y|S],M1,N1),
    bcf1([Y|S],M1,N1,[Z|K])).

/*************************************************/
/*                                               */
/*  abspr(A,B,C) checks for absorptions between  */
/*  the list A of consensus lists and the old    */
/*  list of lists B. C is the new list of lists  */
/*  formed after all absorptions.                */
/*                                               */
/*************************************************/

abspr([Y|S],[],[H|T]) :- abspr(S,[Y],[H|T]).
abspr([],[X|L],[X|L]).
abspr([Y|S],[X|L],[Z|K]) :- absp(Y,[X|L],[X1|L1]),
    abspr(S,[X1|L1],[Z|K]).

/*************************************************/
/*                                               */
/*  start1(A,B,C,D) determines where the old next*/
/*  list A is in the new list B of lists, and    */
/*  returns both the new next list D and the new */
/*  left part C of the list of lists.            */
/*                                               */
/*************************************************/

start1(N,[],[],[]).
start1(N,[Y|S],[M1],N1) :- N=Y,
    M1=N, [N1|L]=S.
start1(N,[Y|S],[Y|L],N1) :- start1(N,S,L,N1).

/*************************************************/
/*                                               */
/*  testit(A,B,C,D) determines if the first list */
/*  of the list A of lists is in the new list B. */
/*  The new left part C and the new next term D  */
/*  are returned.                                */
/*                                               */
/*************************************************/

testit([X2|L2],[Y|S],M1,N1) :- member(X2,[Y|S]),
    start1(X2,[Y|S],M1,N1).
testit([X2|L2],[Y|S],M1,N1) :- testit(L2,[Y|S],M1,N1).
testit([],[Y|S],Y,X) :- [X|L]=S.
testit([],[Y],[Y],[]).

/*************************************************/
/*                                               */
/*  reverse(A,B) returns B the reverse order of A*/
/*                                               */
/*************************************************/

reverse([X],[X]).
reverse([X,Y],[Y,X]).
reverse([X|R],L) :- reverse(R,L1), concat(L1,[X],L).

/*************************************************/
/*                                               */
/*  concat(A,B,C) returns the list C, comprised  */
/*  of the list B appended to the list A.        */
/*                                               */
/*************************************************/

concat([],L,L).
concat([F|L1],L2,[F|L3]) :-
    concat(L1,L2,L3).


/*************************************************/
/*                                               */
/*  getstr(A,B) returns a list of ascii          */
/*  characters.                                  */
/*                                               */
/*************************************************/




getstr(26,[]) :- !.
getstr(C,[C|R]) :- get(C2), getstr(C2,R).
/*************************************************/
/*                                               */
/*  time is a routine to call the internal       */
/*  timer of the system.                         */
/*                                               */
/*************************************************/

timeO(T) :- statistics(runtime,[T,_]).
time :- statistics(runtime,[_,T]), write(T), nl.
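The timing calls wrapped around doit and find_the_keys in this listing follow one pattern: read the CPU clock, run the goal, read the clock again, and subtract. A minimal sketch of that pattern as a single reusable predicate is given here. The name timed/2 is hypothetical and does not appear in the listing, and the sketch assumes a Prolog system in which statistics(runtime, [T,_]) reports CPU time in milliseconds, as the listing's own calls suggest.

```prolog
% timed(+Goal, -Ms): hypothetical helper, not part of the original listing.
% Runs Goal once and returns the CPU time it used, in milliseconds.
timed(Goal, Ms) :-
    statistics(runtime, [T0, _]),   % CPU time before the call
    once(Goal),
    statistics(runtime, [T1, _]),   % CPU time after the call
    Ms is T1 - T0.
```

With such a helper, a call like timed(doit, Time) would stand in for the explicit pair of clock readings used in doitall and keys.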


/*************************************************/
/*                                               */
/*  neg(A,B) is a routine that returns the       */
/*  complement B of a boolean expression A.      */
/*                                               */
/*************************************************/

neg(X&Y,A+B) :- neg(X,A), neg(Y,B), !.
neg(X+Y,A&B) :- neg(X,A), neg(Y,B), !.
neg(X',Y) :- Y=X, !.
neg(X,Y) :- Y=X', !.

/*************************************************/
/*                                               */
/*  simp(A,B) and mult(C,D) are routines that    */
/*  return the sum of products form of a formula */
/*  that is in product of sums form.             */
/*                                               */
/*************************************************/

simp(X&X,X).
simp(X+X,X) :- !.
simp(X+Y,Z) :- simp(X,R), simp(Y,S), (Z)=(R+S), !.
simp(X&Y,Z) :- simp(X,R), simp(Y,S), mult(R,S,Z), !.
simp(X,X) :- !.
mult(X,X,X) :- !.
mult(A+B,C+D,Z) :- mult(A,C+D,Y),
    mult(B,C+D,X), (Z)=(X+Y), !.
mult(A&B,A,A&B) :- !.
mult(A&B,B,A&B) :- !.
mult(A,A&B,A&B) :- !.
mult(B,A&B,A&B) :- !.
mult(A+B,C,Z) :- mult(A,C,X), mult(B,C,Y), (Z)=(X+Y), !.
mult(C,A+B,Z) :- mult(C,A,X), mult(C,B,Y), (Z)=(X+Y), !.
mult(X,Y,X&Y) :- !.
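The POS-to-SOP translation performed by simp and mult amounts to distributing each product over the sums inside it. The following is a reduced sketch of that idea for a present-day Prolog system, not a transcription of the routines in this listing: it keeps only the distributive clauses and omits the special cases that merge a product with one of its own factors. The operator declaration for & is an assumption (the listing's own declaration is not shown); it is chosen so that & binds more tightly than +.

```prolog
:- op(400, yfx, &).          % assumed priority: & binds tighter than +

% simp(+Formula, -SOP): multiply out products of sums.
simp(X&X, X) :- !.
simp(X+X, X) :- !.
simp(X+Y, R+S) :- !, simp(X, R), simp(Y, S).
simp(X&Y, Z) :- !, simp(X, R), simp(Y, S), mult(R, S, Z).
simp(X, X).

% mult(+A, +B, -Product): distribute A&B over any sums in A or B.
mult(X, X, X) :- !.
mult(A+B, C, X+Y) :- !, mult(A, C, X), mult(B, C, Y).
mult(C, A+B, X+Y) :- !, mult(C, A, X), mult(C, B, Y).
mult(X, Y, X&Y).
```

For example, the query simp((a+b)&c, Z) binds Z to a&c+b&c. Because absorption is not attempted here, a separate pass such as abspr would still be needed to remove redundant terms.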
/*************************************************/
/*                                               */
/*  trans(A,B) is a routine that translates a    */
/*  propositional logic statement A into a SOP   */
/*  formula B.                                   */
/*                                               */
/*************************************************/

trans(V=>Y,Z) :- neg(Y,W), simp(V&W,Z).


/*************************************************/
/*                                               */
/*  absp(A,B,C) returns C, a list of lists       */
/*  corresponding to a SOP formula. C is the     */
/*  result of performing absorption on the SOP   */
/*  formula B with the list A. A is a list and   */
/*  B is a list of lists.                        */
/*                                               */
/*************************************************/

absp([],[X|L],[X|L]) :- !.
absp(A,[],[A]) :- !.
absp(A,[X|L],[X|L]) :- sublist(X,A), !.
absp(A,[X|L],[Z|R]) :- sublist(A,X), absp(A,L,[Z|R]), !.
absp(A,[X|L],[X|R]) :- absp(A,L,R), !.

/*************************************************/
/*                                               */
/*  sublist(A,B) determines if the list A is     */
/*  contained in the list B.                     */
/*                                               */
/*************************************************/

sublist([X|L],[Y|S]) :- member(X,[Y|S]),
    sublist(L,[Y|S]), !.
sublist([],M) :- !.
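Absorption on a list-of-lists SOP formula rests on a containment test: a term X absorbs a term A when every literal of X also occurs in A. A minimal sketch of inserting one new term into a formula under that rule follows. The names absorb/3 and subset_of/2 are hypothetical, chosen to avoid clashing with the routines of this listing, and the sketch does not reproduce absp's special handling of an empty first argument.

```prolog
% subset_of(+A, +B): every element of list A occurs in list B.
subset_of([], _).
subset_of([X|L], B) :- memberchk(X, B), subset_of(L, B).

% absorb(+Term, +Terms0, -Terms): add Term to the SOP formula Terms0.
% Term is dropped if an existing term absorbs it; existing terms that
% Term absorbs are removed.
absorb(A, [], [A]).
absorb(A, [X|L], [X|L]) :- subset_of(X, A), !.               % X absorbs A
absorb(A, [X|L], R)     :- subset_of(A, X), !, absorb(A, L, R). % A absorbs X
absorb(A, [X|L], [X|R]) :- absorb(A, L, R).
```

For instance, absorb([a], [[a,b],[c]], R) gives R = [[c],[a]], since a absorbs ab, while absorb([a,b], [[a],[c]], R) leaves the formula unchanged.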

/*************************************************/
/*                                               */
/*  member(A,B) determines if the element A is   */
/*  contained in the list B.                     */
/*                                               */
/*************************************************/

member(X,[X|R]) :- !.
member(X,[Y|R]) :- member(X,R), !.
member([],X) :- !.


/*************************************************/
/*                                               */
/*  consc(A,B,C) returns a list C of lists       */
/*  consisting of all consensus terms between the*/
/*  list A and every list contained in the list B*/
/*  of lists. List C consists of non-empty lists.*/
/*                                               */
/*************************************************/

consc([X|L],Y,[Z|K]) :- cons([X|L],Y,[M|L1]),
    delete([M|L1],[Z|K]).

/*************************************************/
/*                                               */
/*  delete(A,B) removes any empty lists contained*/
/*  in the list of lists A and returns the list  */
/*  of lists B which is void of any empty lists. */
/*                                               */
/*************************************************/

delete([M|L1],[Z|K]) :- (compare(=,L1,[[]]),
    delete1(M,[Z|K]);
    compare(=,L1,[]), [Z|K]=[M|L1];
    Z=M, delete(L1,K)).
delete1(M,[M]).

/*************************************************/
/*                                               */
/*  cons(A,B,C) returns a list C of lists        */
/*  consisting of all consensus terms between the*/
/*  list A and every list contained in the list B*/
/*  of lists. The list C may contain empty lists.*/
/*                                               */
/*************************************************/

cons([],Y,X) :- [] =.. X.
cons([X|L],Y,[M|L1]) :- testcons1(X,Y,Y,Z),
    (compare(=,Z,[]), cons(L,Y,[M|L1]), !;
    compare(=,Z,[[]]), !, cons(L,Y,[M|L1]);
    M=Z, cons(L,Y,L1)).
cons([X|L],Y,[Z|K]) :- cons(L,Y,[Z|K]).


/*************************************************/
/*                                               */
/*  testcons1(A,B,C,D) tests for a literal in    */
/*  opposition between the list A and C. D is the*/
/*  returned consensus term. B and C are         */
/*  identical when testcons1 is originally       */
/*  called. If no literal in opposition is found,*/
/*  D is set to the empty list. If a literal in  */
/*  opposition is found, testcons2(U,V,X,Y,Z)    */
/*  is called.                                   */
/*                                               */
/*************************************************/

testcons1([A|C],Y,[],[]).
testcons1([A|C],Y,[B|D],X) :- sublist([B],[A|C]),
    testcons1([A|C],Y,D,X).
testcons1([A|C],Y,[B|D],X) :- neg(B,G),
    sublist([G],[A|C]),
    testcons2([A|C],Y,D,[B,G],X).
testcons1([A|C],Y,[B|D],X) :- testcons1([A|C],Y,D,X).

/*************************************************/
/*                                               */
/*  testcons2(A,B,C,D,E) tests for a second      */
/*  literal in opposition between list A and C.  */
/*  Lists A and B are the original lists, list D */
/*  contains the first literal in opposition and */
/*  its complement, and list E contains the      */
/*  consensus term. If a second literal in       */
/*  opposition is found, E is the empty list.    */
/*                                               */
/*************************************************/

testcons2([A|C],Y,[],[B,G],[Z|K]) :-
    formit1([A|C],Y,[B,G],[Z|K]).
testcons2([A|C],Y,[E|L],[B,G],X) :- neg(E,F),
    sublist([F],[A|C]), [] =.. X.
testcons2([A|C],Y,[E|L],[B,G],[Z|K]) :-
    testcons2([A|C],Y,L,[B,G],[Z|K]).

/*************************************************/
/*                                               */
/*  formit1(A,B,C,D) places all non-opposition   */
/*  literals in list B into the consensus list D.*/
/*  List C contains the opposition literal and   */
/*  its complement.                              */
/*                                               */
/*************************************************/

formit1([A|C],[],[B,G],[Z|K]) :-
    formit2([A|C],[B,G],[Z|K]).
formit1([A|C],[D|E],[B,G],[Z|K]) :- B=D,
    formit1([A|C],E,[B,G],[Z|K]).
formit1([A|C],[D|E],[B,G],[Z|K]) :-
    sublist([D],[A|C]),
    formit1([A|C],E,[B,G],[Z|K]).
formit1([A|C],[D|E],[B,G],[D|K]) :-
    formit1([A|C],E,[B,G],K).

/*************************************************/
/*                                               */
/*  formit2(A,B,C,D) places all non-opposition   */
/*  literals of list A into the consensus list C.*/
/*  List B contains the opposition literal and   */
/*  its complement.                              */
/*                                               */
/*************************************************/

formit2([],[B,G],[]).
formit2([A|C],[B,G],X) :- G=A, formit3(C,X).
formit2([A|C],[B,G],[A|K]) :- formit2(C,[B,G],K).

/*************************************************/
/*                                               */
/*  formit3(A,B) sets the tail of A of a list to */
/*  be the list B.                               */
/*                                               */
/*************************************************/

formit3(C,C).
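The routines testcons1, testcons2 and formit1 through formit3 together compute the consensus of two terms: it exists only when exactly one literal of the first term appears complemented in the second, and it consists of the remaining literals of both terms. A compact sketch of the same definition is shown here for a present-day Prolog system. It uses findall/3 instead of the recursive search of the listing, represents the complement of a literal x as neg(x) (an assumption made for the sketch; the listing uses a postfix complement mark), and relies on the library predicates exclude/3 and append/3.

```prolog
% complement(+Literal, -Opposite): x <-> neg(x).
complement(neg(X), X) :- !.
complement(X, neg(X)).

% consensus(+T1, +T2, -C): C is the consensus of terms T1 and T2.
% Fails unless exactly one literal of T1 is opposed in T2.
consensus(T1, T2, C) :-
    findall(L,
            ( member(L, T1), complement(L, N), memberchk(N, T2) ),
            [Opp]),                      % exactly one opposition
    complement(Opp, NOpp),
    exclude(==(Opp), T1, R1),            % T1 without the opposed literal
    exclude(==(NOpp), T2, R2),           % T2 without its complement
    append(R1, R2, R),
    sort(R, C).                          % remove duplicate literals
```

For example, consensus([a,b], [neg(a),c], C) yields C = [b,c], while consensus([a,b], [neg(a),neg(b)], C) fails because two literals are in opposition.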

/*************************************************/
/*                                               */
/*  parse(Z) returns Z a list of list from a     */
/*  SOP formula.                                 */
/*                                               */
/*************************************************/

parse(Z) --> term(X), "+", restparse(X,Z).
parse([Z]) --> term(Z).

/*************************************************/
/*                                               */
/*  term(Z) converts a term in the SOP formula   */
/*  into a list.                                 */
/*                                               */
/*************************************************/

term(Z) --> delim, token(X), delim, resterm(X,Z), !.


/*************************************************/
/*                                               */
/*  delim removes the following symbols from the */
/*  list created by term(Z): blanks, &, +, ',    */
/*  (, ).                                        */
/*                                               */
/*************************************************/

delim --> "&", delim, !.
delim --> " ", delim, !.
delim --> "(", delim, !.
delim --> ")", delim, !.
delim --> [], !.


/*************************************************/
/*                                               */
/*  resterm(X,Y) sets the first element of list Y*/
/*  to be the element X, and calls term to find  */
/*  the rest of the list Y.                      */
/*                                               */
/*************************************************/

resterm(X,[X|R]) --> term(R).
resterm(X,[X]) --> [].


/*************************************************/
/*                                               */
/*  token(Y) returns an atom Y for a member of   */
/*  ascii list.                                  */
/*                                               */
/*************************************************/

token(Y) --> [X], [96], {name(Y,[X,96])}, !.
token(Y) --> [X], {name(Y,[X]), Y\=='+'}, !.

/*************************************************/
/*                                               */
/*  restparse(X,Y) sets the first element of the */
/*  list of lists Y to be the list X, which      */
/*  corresponds to first term of the SOP formula.*/
/*  restparse then calls parse.                  */
/*                                               */
/*************************************************/

restparse(X,[X|R]) --> parse(R).
restparse(X,[X]) --> [].
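The grammar rules parse, term, token and restparse turn the character codes of an SOP formula into a list of lists, one inner list per product term. A stripped-down sketch of the same idea is given here for a present-day Prolog system. It is not the grammar of this listing: it accepts only single lower-case letters joined by & and +, with no blanks, parentheses or complement marks, and the names parse_sop, sop, product, product_rest and literal are hypothetical.

```prolog
% parse_sop(+Atom, -Terms): Terms is the list-of-lists form of the
% SOP formula spelled by Atom, e.g. 'a&b+c'.
parse_sop(Atom, Terms) :-
    atom_codes(Atom, Codes),
    phrase(sop(Terms), Codes).

sop([T|Ts]) --> product(T), [0'+], !, sop(Ts).
sop([T])    --> product(T).

product([L|Ls]) --> literal(L), product_rest(Ls).

product_rest(Ls) --> [0'&], !, product(Ls).
product_rest([]) --> [].

% A literal is a single lower-case letter, returned as an atom.
literal(A) --> [C], { C >= 0'a, C =< 0'z, atom_codes(A, [C]) }.
```

The query parse_sop('a&b+c', T) binds T to [[a,b],[c]], the same list-of-lists shape that bcf takes as input.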
/*************************************************/
/*                                               */
/*  find_the_keys(A,B,C) inputs a list A of      */
/*  symbols representing all of the attributes   */
/*  for a relation in a database, and a list B   */
/*  which corresponds to the BCF of the          */
/*  functional dependencies of the relation. The */
/*  list C contains the list of keys for the     */
/*  relation.                                    */
/*                                               */
/*************************************************/

find_the_keys([X1|L1],[X|L],[F|R]) :-
    bcf1([X|L],[X|L],[X1|L1],[Z|K]),
    determine([Z|K],[F|R]).

/*************************************************/
/*                                               */
/*  determine(A,B) inputs the list A of lists    */
/*  corresponding to the Blake Canonical Form of */
/*  the functional dependencies and list of      */
/*  attribute symbols for a relation. The list   */
/*  of keys B is returned.                       */
/*                                               */
/*************************************************/

determine([],[]).
determine([Z|K],N) :- (negsin(Z,M),
    N=[X|L], X=M, determine(K,L);
    determine(K,N)).

/*************************************************/
/*                                               */
/*  negsin(A,B) inputs a list A of lists and     */
/*  outputs a list B of lists that contain all   */
/*  the lists of A that have no complemented     */
/*  literals as members. The list B is a list of */
/*  keys of varying length.                      */
/*                                               */
/*************************************************/

negsin([],[]).
negsin([X|L],[H|T]) :- [X]=[M'], !, fail.
negsin([X|L],[X|T]) :- negsin(L,T).


/*************************************************/
/*                                               */
/*  write_keys1(A), write_keys(A), write_keys2(B)*/
/*  output the keys and a heading for them.      */
/*                                               */
/*************************************************/

write_keys1([H|T]) :- write('The keys are: '),
    nl, write_keys([H|T]).
write_keys([]) :- nl.
write_keys([H|T]) :- write_keys2(H), write_keys(T).


APPENDIX  C 


EXECUTIONS  OF  THE  FD-KEY  ALGORITHM 


.type dbase
[a1,b1,c1,d1].
[a2,b3,c1,d1].
[a1,b2,c2,d2].
[a1,b1,c1,d2].
[a2,b3,c1,d2].
[a1,b2,c2,d1].

.run Prolog

Prolog-10 version 3
[ Consulting 'prolog.ini' ]
| ?- restore(cfd).
[ closing all active files ]
[ restore complete ]

yes
| ?- mainthing([a,b,c,d]).
Time for functional dependency generation is
4251ms
Time for bcf is
172ms.
Time for key search is 331 ms.
The keys are:
dca
db

yes

yes
| ?- halt.

[ Prolog execution halted ]
EXIT

.type  propa 
ba*>a. 
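The keys reported in the run above can be cross-checked directly against the data, independently of the BCF method used by the program. The sketch below is a brute-force check for a present-day Prolog system, not part of the original listing: a set of attribute positions is a superkey when projecting every tuple onto those positions yields no duplicates. The predicate names tuple/1, project/3 and superkey/1 are hypothetical, and the facts assume that the first dbase file reads as the six tuples shown.

```prolog
% The six tuples of the first dbase file (attributes a, b, c, d).
tuple([a1,b1,c1,d1]).
tuple([a2,b3,c1,d1]).
tuple([a1,b2,c2,d2]).
tuple([a1,b1,c1,d2]).
tuple([a2,b3,c1,d2]).
tuple([a1,b2,c2,d1]).

% project(+Positions, +Tuple, -Proj): keep the listed 1-based positions.
project(Positions, Tuple, Proj) :-
    findall(V, ( member(I, Positions), nth1(I, Tuple, V) ), Proj).

% superkey(+Positions): no two tuples agree on all listed positions.
superkey(Positions) :-
    findall(P, ( tuple(T), project(Positions, T, P) ), Ps),
    sort(Ps, Distinct),                 % sort/2 removes duplicates
    length(Ps, N),
    length(Distinct, N).
```

Under these assumptions superkey([2,4]) and superkey([1,3,4]) both succeed, matching the reported keys db and dca, while superkey([2]) and superkey([1,4]) fail.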
.type dbase
[smith,surgery,4,b2,6].
[jones,pathology,2,a1,3].
[jones,pathology,1,c3,5].
[evans,anatomy,2,c3,10].
[jones,pathology,2,a1,7].
[evans,surgery,3,a2,3].
[smith,anatomy,5,a1,5].

.run Prolog

Prolog-10 version 3
[ Consulting 'prolog.ini' ]
| ?- restore(cfd).
[ closing all active files ]
[ restore complete ]

yes
| ?- mainthing([p,c,y,r,t]).
Time for functional dependency generation is
17148ms
Time for bcf is
15418ms.
Time for key search is 16090 ms.
The keys are:
rt
yt
pt
ct

yes
| ?- halt.

[ Prolog execution halted ]

EXIT




.type propa
a=>b.
c=>b&d.
b=>e&f.
e=>d.
f=>a&d.


.run Prolog

Prolog-10 version 3
[ Consulting 'prolog.ini' ]
| ?- restore(cfd).
[ closing all active files ]
[ restore complete ]

yes
| ?- solve_for_keys([a,b,c,d,e,f]).
Time for bcf is
1432ms.
Time for key search is 5971 ms.
The keys are:
c

yes
| ?- halt.

[ Prolog execution halted ]

EXIT